CYCLES AND EIGENVALUES OF SEQUENTIALLY GROWING 
RANDOM REGULAR GRAPHS 

TOBIAS JOHNSON AND SOUMIK PAL 



Abstract. Consider the sum of $d$ many iid random permutation matrices on
$n$ labels along with their transposes. The resulting matrix is the adjacency
matrix of a random regular (multi)-graph of degree $2d$ on $n$ vertices. It is
known that the distribution of smooth linear eigenvalue statistics of this
matrix is given asymptotically by sums of Poisson random variables. This is in
contrast with Gaussian fluctuation of similar quantities in the case of Wigner
matrices. It is also known that for Wigner matrices the joint fluctuation of
linear eigenvalue statistics across minors of growing sizes can be expressed in
terms of the Gaussian Free Field (GFF). In this article we explore joint
asymptotic (in $n$) fluctuation for a coupling of all random regular graphs of
various degrees obtained by growing each component permutation according to the
Chinese Restaurant Process. Our primary result is that the corresponding
eigenvalue statistics can be expressed in terms of a family of independent Yule
processes with immigration. These processes track the evolution of short cycles
in the graph. If we now take $d$ to infinity, certain GFF-like properties emerge.

1. Introduction 

We consider graphs that have labeled vertices and are regular, i.e., every
vertex has the same degree. We allow our graphs to have loops and multiple edges
(such graphs are sometimes called multigraphs or pseudographs). Additionally,
our graphs will be sparse in the sense that the degree will be negligible
compared to the order. Every such graph has an associated adjacency matrix whose
$(i,j)$th element is the number of edges between vertices $i$ and $j$, with
loops counted twice. When the graph is randomly selected, the matrix is random,
and we are interested in studying the eigenvalues of the resulting symmetric
matrix. Note that, due to regularity, it does not matter whether we consider the
eigenvalues of the adjacency or the Laplacian matrix.

The precise distribution of this random regular graph is somewhat ad hoc. We
will use what is called the permutation model. Consider the permutation digraphs
generated by $d$ many iid random permutations on $n$ labels. We remove the
direction of the edges and collapse all these graphs on one another. This
results in a $2d$-regular graph on $n$ vertices, denoted by $G(n, 2d)$. At the
matrix level this is given by adding all the $d$ permutation matrices and their
transposes.
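
The matrix-level description above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `permutation_model_adjacency` and the list-of-lists representation are assumptions made here, not part of the paper.

```python
import random


def permutation_model_adjacency(n, d, rng):
    """Adjacency matrix (list of lists) of one sample of G(n, 2d).

    Hypothetical helper: sums d uniform permutation matrices and their
    transposes, as in the permutation model described above.
    """
    adj = [[0] * n for _ in range(n)]
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)  # uniform random permutation of {0, ..., n-1}
        for i, j in enumerate(perm):
            adj[i][j] += 1  # entry of the permutation matrix
            adj[j][i] += 1  # entry of its transpose
    return adj


A = permutation_model_adjacency(6, 2, random.Random(0))
for row in A:
    print(row)
```

A fixed point of a permutation contributes 2 to a diagonal entry, which matches the convention that loops are counted twice, and every row sums to $2d$.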

Our present work is an extension of the study of eigenvalue fluctuations carried
out in [DJPP11]. We are motivated by the recent work by Borodin on joint
eigenvalue fluctuations of minors of Wigner matrices and the Gaussian Free Field
(GFF) [Bor10a, Bor10b]. Eigenvalues of minors are closely related to interacting
particle systems [Fer10, FF10] and the KPZ universality class of random surfaces
[BF08]. See [JN06] for more on eigenvalues of minors of GUE and [ANvM11] for
those of Dyson's Brownian motion.

Let us consider a particular but important case of Borodin's result in [Bor10a]
(single sequence, the entire $\mathbb{N}$). An $n \times n$ real symmetric
Wigner matrix has iid upper triangular off-diagonal elements with four moments
identical to the standard Gaussian. The diagonal elements are usually taken to
be iid with mean zero and variance two. Notice that every principal submatrix
(called a minor in this context) of a Wigner matrix is again a Wigner matrix of
a smaller order. Thus, on some probability space one can construct an infinite
order Wigner matrix $W$ whose $n \times n$ minor $W(n)$ is a Wigner matrix of
order $n$.

Let $z$ be a complex number in the upper half plane $\mathbb{H}$. Define
$y = |z|^2$ and $x = 2\Re(z)$. Consider the minor $W(\lfloor ny \rfloor)$, and
let $N(z)$ be the number of its eigenvalues that are greater than or equal to
$\sqrt{n}\,x$. Define the height function

(1) $$H_n(z) := \sqrt{\frac{\pi}{2}}\, N(z).$$

Then Borodin shows that $\{H_n(z) - \mathbf{E} H_n(z),\ z \in \mathbb{H}\}$,
viewed as distributions, converges in law to a generalized Gaussian process on
$\mathbb{H}$ with a covariance kernel

(2) $$C(z,w) = \frac{1}{2\pi} \ln \left| \frac{z - \bar{w}}{z - w} \right|.$$

The above is the covariance kernel for the GFF on the upper half plane.

An equivalent assertion is the following. Let $[n]$ denote the set of integers
$\{1, 2, \ldots, n\}$. Consider the Chebyshev polynomials of the first kind,
$\{T_n,\ n = 0, 1, 2, \ldots\}$, on the interval $[-1,1]$. These polynomials are
given by the identity $T_n(\cos\theta) = \cos(n\theta)$. We specialize
[Bor10a, Proposition 3] for the case of GOE ($\beta = 1$). Fix $m$ positive real
numbers $t_1 < t_2 < \cdots < t_m$. In the notation of [Bor10a], we take $L = n$
and $B_i(n) = [\lfloor t_i n \rfloor]$. Then, for any positive integers
$j_1, j_2, \ldots, j_m$, the random vector

$$\Bigl( \operatorname{tr} T_{j_i}\bigl( W(\lfloor t_i n \rfloor) / 2\sqrt{t_i n} \bigr) - \mathbf{E} \operatorname{tr} T_{j_i}\bigl( W(\lfloor t_i n \rfloor) / 2\sqrt{t_i n} \bigr),\ i \in [m] \Bigr)$$

converges in law, as $n$ tends to infinity, to a centered Gaussian vector with a
covariance kernel

(3) $$\lim_{n\to\infty} \operatorname{Cov}\Bigl( \operatorname{tr} T_i\bigl( W(\lfloor tn \rfloor) / 2\sqrt{tn} \bigr),\ \operatorname{tr} T_k\bigl( W(\lfloor sn \rfloor) / 2\sqrt{sn} \bigr) \Bigr) = \delta_{ik}\, \frac{k}{2} \left( \frac{s \wedge t}{s \vee t} \right)^{k/2}.$$

In particular, all such covariances are zero when $i \neq k$. Note that the
traces can be expressed as integrals of the height function of the corresponding
submatrices. Thus, by approximating continuous compactly supported functions of
$z$ by a function that is piecewise constant in $y$ and polynomial in $x$, one
gets the kernel (2).

1.1. Main results. By a tower of random permutations we mean a sequence of
random permutations $(\pi^{(n)},\ n \in \mathbb{N})$ such that

(i) $\pi^{(n)}$ is a uniformly distributed random permutation of $[n]$ for each $n$, and
(ii) for each $n$, if $\pi^{(n)}$ is written as a product of cycles then
$\pi^{(n-1)}$ is derived from $\pi^{(n)}$ by deletion of the element $n$ from
its cycle.

The stochastic process that grows $\pi^{(n)}$ from $\pi^{(n-1)}$ by sequentially
inserting an element $n$ randomly is called the Chinese Restaurant Process
(CRP). We will review the basic principles in a later section. In [KOV04] and
other related work, a sequence of permutations satisfying condition (ii) is
called a virtual permutation, and the distribution on virtual permutations
satisfying condition (i) is considered as a substitute for Haar measure on
$S(\infty)$, the infinite symmetric group. This is used to study the
representation theory of $S(\infty)$, with connections to Random Matrix Theory.
A recent extension of this idea is [BNN11].

Now suppose we construct a countable collection $\{\Pi_d,\ d \in \mathbb{N}\}$
of towers of random permutations. We will denote the permutations in $\Pi_d$ by
$\{\pi_d^{(n)},\ n \in \mathbb{N}\}$. Then it is possible to model every
possible $G(n, 2d)$ by adding the permutation matrices (and their transposes)
corresponding to $\{\pi_j^{(n)},\ 1 \le j \le d\}$. In what follows we will keep
$d$ fixed and consider $n$ as a growing parameter. Thus, $G_n$ will represent
$G(n, 2d)$ for some fixed $d$. Here and later, $G_0$ will represent the empty
graph. We construct a continuous-time version of this by inserting new vertices
into $G_n$ with rate $n + 1$. Formally, define independent times
$T_i \sim \mathrm{Exp}(i)$, and let

$$M_t = \max\Bigl\{ m : \sum_{i=1}^{m} T_i \le t \Bigr\},$$

and define the continuous-time Markov chain $G(t) = G_{M_t}$.

Our first result is about the process of short cycles in the graph process
$G(t)$. Let $(C_k^{(s)}(t),\ k \in \mathbb{N})$ denote the number of cycles of
various lengths $k$ that are present in $G(s + t)$. This process is not Markov,
but nonetheless it converges to a Markov process (indexed by $t$) as $s$ tends
to infinity.

To describe the limit, define

$$a(d,k) = \begin{cases} (2d-1)^k - 1 + 2d, & \text{when $k$ is even,} \\ (2d-1)^k + 1, & \text{when $k$ is odd.} \end{cases}$$

Consider the set of natural numbers $\mathbb{N} = \{1, 2, \ldots\}$ with the
measure

$$\mu(k) = \frac{1}{2} \bigl[ a(d,k) - a(d,k-1) \bigr], \quad k \in \mathbb{N}, \qquad a(d,0) := 0.$$

Consider a Poisson point process $\chi$ on $\mathbb{N} \times [0, \infty)$ with
an intensity measure given on $\mathbb{N} \times (0, \infty)$ by the product
measure $\mu \otimes \mathrm{Leb}$, where $\mathrm{Leb}$ is the Lebesgue
measure, and with additional masses of $a(d,k)/2k$ on $(k, 0)$ for
$k \in \mathbb{N}$.
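
As a sanity check on these quantities, the following Python sketch compares $a(d,k)$ with a brute-force count of cyclically reduced words of length $k$ in $d$ letters and their inverses (the words that reappear in Section 2.2). That $a(d,k)$ equals this count is a standard fact about free groups which we assume here; the paper's text above only gives the formula. The encoding of letters as signed integers and the function names are illustrative assumptions.

```python
from itertools import product


def a(d, k):
    """The quantity a(d, k) defined above, with a(d, 0) := 0."""
    if k == 0:
        return 0
    return (2 * d - 1) ** k + (2 * d - 1 if k % 2 == 0 else 1)


def count_cyclically_reduced(d, k):
    """Brute-force count of cyclically reduced words of length k.

    Letter +i stands for pi_i and -i for its inverse (an encoding chosen
    for this sketch). A word is cyclically reduced if no letter is
    followed, cyclically, by its own inverse.
    """
    letters = [s * i for i in range(1, d + 1) for s in (1, -1)]
    return sum(
        all(w[i] != -w[(i + 1) % k] for i in range(k))
        for w in product(letters, repeat=k)
    )


def mu(d, k):
    """The measure mu(k) = [a(d, k) - a(d, k-1)] / 2 (always an integer)."""
    return (a(d, k) - a(d, k - 1)) // 2


for k in range(1, 5):
    print(k, a(2, k), count_cyclically_reduced(2, k), mu(2, k))
```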

Let $P_x$ denote the law of a one-dimensional pure-birth process on
$\mathbb{N}$ given by the generator

$$L f(k) = k \bigl( f(k+1) - f(k) \bigr), \quad k \in \mathbb{N},$$

starting from $x \in \mathbb{N}$. This is also known as the Yule process.

Suppose we are given a realization of $\chi$. For any atom $(k, y)$ of the
countably many atoms of $\chi$, we start an independent process
$(X_{k,y}(t),\ t \ge 0)$ with law $P_k$. Define the random sequence

$$N_k(t) := \sum_{(j,y) \in \chi \cap \{[k] \times [0,t]\}} \mathbf{1}\{ X_{j,y}(t - y) = k \}.$$

In other words, at time $t$, for every site $k$, we count how many of the
processes that started at time $y \le t$ at site $j \le k$ are currently at $k$.
Note that both $(N_k,\ k \in \mathbb{N})$ and $(N_k,\ k \in [K])$, for some
$K \in \mathbb{N}$, are Markov processes.

Theorem 1. The process $(C_k^{(s)}(t),\ k \in \mathbb{N},\ 0 \le t < \infty)$
converges in law, in the product topology on $D^\infty[0,\infty)$, to the Markov
process $(N_k(t),\ k \in \mathbb{N},\ 0 \le t < \infty)$. The limiting process
is stationary.

Remark 2. In fact, the same argument used to prove Theorem 1 shows that the
process $(C_k^{(s)}(t),\ -\infty < t < \infty)$ converges in law to the Markov
process $(N_k(t),\ -\infty < t < \infty)$ running in stationarity. The same
conclusion holds for all the following theorems in this section.

We now explore the joint convergence across various $d$'s. Define
$C_{d,k}^{(s)}(t)$ naturally, stressing the dependence on the parameter $d$.

Theorem 3. There is a joint process convergence of
$(C_{i,k}^{(s)}(t),\ k \in \mathbb{N},\ i \in [d],\ t \ge 0)$ to a limiting
process $(N_{i,k}(t),\ k \in \mathbb{N},\ i \in [d],\ t \ge 0)$. This limit is a
Markov process whose marginal law for every fixed $d$ is described in Theorem 1.
Moreover, for any $d \in \mathbb{N}$, the process
$(N_{d+1,k}(\cdot) - N_{d,k}(\cdot),\ k \in \mathbb{N})$ is independent of the
process $(N_{i,k}(\cdot),\ k \in \mathbb{N},\ i \in [d])$ and evolves as a
Markov process. Its generator (defined on functions dependent on finitely many
coordinates) is given by

$$L f(x) = \sum_{k=1}^{\infty} k x_k \bigl[ f(x + e_{k+1} - e_k) - f(x) \bigr] + \sum_{k=1}^{\infty} \nu(d,k) \bigl[ f(x + e_k) - f(x) \bigr],$$

where $x$ is a nonnegative sequence, $(e_k,\ k \in \mathbb{N})$ are the
canonical orthonormal basis of $\ell^2$, and

$$\nu(d,k) = \frac{1}{2} \bigl[ a(d+1,k) - a(d+1,k-1) - a(d,k) + a(d,k-1) \bigr].$$

Remark 4. Theorems 1 and 3 show an underlying branching process structure. We
actually prove a more general decomposition where cycles are tracked by edge
labels. The additive structure also imparts a natural intertwining relationship
between the Markov operators. See [CPY98, Section 2] and [DF90, Bor10a].

We now focus on eigenvalues of $G(t)$. Note that there is no easy exact
relationship between the eigenvalues of $G_n$ for various $n$, since the
eigenvectors play a role in determining any such identity. In fact, the
eigenvalues of $G_n$ and $G_{n+1}$ need not be interlaced. However, one can
consider linear eigenvalue statistics for the graph $G(n, 2d)$. That is, for any
$2d$-regular graph $G$ on $n$ vertices and function
$f \colon \mathbb{R} \to \mathbb{R}$, define the random variable

$$\operatorname{tr} f(G) := \sum_{i=1}^{n} f(\lambda_i),$$

where $\lambda_1 \ge \cdots \ge \lambda_n$ are the eigenvalues of the adjacency
matrix of $G$ divided by $2(2d-1)^{1/2}$. The scaling is necessary to take a
limit with respect to $d$.

By a polynomial basis we refer to a sequence of polynomials
$\{f_0 \equiv 1, f_1, f_2, \ldots\}$ such that $f_k$ is a polynomial of degree
$k$ of a single argument over the reals. In the statement below $[\infty]$ will
refer to $\mathbb{N}$.

Theorem 5. There exists a polynomial basis $\{f_i,\ i \in \mathbb{N}\}$
(depending on $d$) such that, for any $K \in \mathbb{N} \cup \{\infty\}$, the
process $(\operatorname{tr} f_k(G(s+t)),\ k \in [K],\ t \ge 0)$ converges in
law, as $s$ tends to infinity, to the Markov process
$(N_k(t),\ k \in [K],\ t \ge 0)$ of Theorem 1. Hence, for any polynomial $f$,
the process $(\operatorname{tr} f(G(s+t)))$ converges to a linear combination
of the coordinate processes of $(N_k(t),\ k \in \mathbb{N})$.

The Markov property is especially intriguing since, to the best of our
knowledge, no similar property of eigenvalues of the standard Random Matrix
ensembles is known. For the special case of minors of the Gaussian
Unitary/Orthogonal Ensembles, the entire distribution of eigenvalues across
minors of various sizes does satisfy a Markov property. However, this is
facilitated by the known symmetry properties of the eigenvectors, and does not
extend to other examples of Wigner matrices.

For our final result we will take $d$ to infinity. We will make the following
notational convention: for any polynomial $f$, we will denote the limiting
process of $(\operatorname{tr} f(G(s+t)),\ t \ge 0)$ by
$(\operatorname{tr} f(G(\infty + t)),\ t \ge 0)$. Recall that this process is a
linear combination of $(N_k(t),\ k \in \mathbb{N},\ t \ge 0)$.

Theorem 6. Let $\{T_k,\ k \in \mathbb{N}\}$ denote the Chebyshev orthogonal
polynomials of the first kind on $[-1, 1]$. Then, for any choice of
$t_1 < t_2 < \cdots < t_m$ and any collection of positive integers
$j_1, j_2, \ldots, j_m$, the random vector

$$\bigl( \operatorname{tr} T_{j_i}(G(\infty + t_i)) - \mathbf{E} \operatorname{tr} T_{j_i}(G(\infty + t_i)),\ i \in [m] \bigr)$$

converges, as $d$ tends to infinity, to a centered Gaussian vector with a
covariance kernel

(4) $$\lim_{d\to\infty} \operatorname{Cov}\bigl( \operatorname{tr} T_i(G(\infty + t)),\ \operatorname{tr} T_k(G(\infty + s)) \bigr) = \delta_{ik}\, \frac{k}{2}\, e^{-k|t-s|}.$$

In fact, the collection of processes

$$\bigl( \operatorname{tr} T_k(G(\infty + t)) - \mathbf{E} \operatorname{tr} T_k(G(\infty + t)),\ t \ge 0,\ k \in \mathbb{N} \bigr)$$

converges weakly in $D^\infty[0,\infty)$ to a collection of independent
Ornstein-Uhlenbeck processes $(U_k(t),\ t \ge 0,\ k \in \mathbb{N})$, running in
equilibrium. Here the equilibrium distribution of $U_k$ is $N(0, k/2)$ and $U_k$
satisfies the stochastic differential equation

$$dU_k(t) = -k U_k(t)\,dt + k\,dW_k(t), \quad t \ge 0,$$

and $(W_k,\ k \in \mathbb{N})$ are iid standard one-dimensional Brownian motions.
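
As a quick consistency check (a standard computation for Ornstein-Uhlenbeck processes, not taken from the paper), the stated equilibrium law $N(0, k/2)$ and the exponential kernel in (4) both follow from the stochastic differential equation. The symbols $\theta$ and $\sigma$ below are generic drift and noise parameters introduced only for this sketch; the theorem corresponds to $\theta = \sigma = k$.

```latex
% Stationary solution of dU = -\theta U\,dt + \sigma\,dW:
U(t) = \sigma \int_{-\infty}^{t} e^{-\theta (t - r)} \, dW(r).
% By the It\^o isometry, for s \le t,
\operatorname{Cov}\bigl(U(t), U(s)\bigr)
  = \sigma^2 \int_{-\infty}^{s} e^{-\theta (t - r)} e^{-\theta (s - r)} \, dr
  = \frac{\sigma^2}{2\theta} \, e^{-\theta (t - s)}.
% With \theta = \sigma = k this gives
\operatorname{Cov}\bigl(U_k(t), U_k(s)\bigr) = \frac{k}{2} \, e^{-k |t - s|},
\qquad \operatorname{Var} U_k(t) = \frac{k}{2}.
```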

A comparison of (4) with Borodin's result (3) shows that the above limit
captures a key property of the GFF covariance structure. The appearance of the
exponential is merely due to a deterministic time-change of the process. A
somewhat more detailed discussion can be found in the following section.

1.2. Existing literature. The study of the spectral properties of sparse regular
random graphs is motivated by several different problems. These matrices do not
fall within the purview of the standard techniques of Random Matrix Theory (RMT)
due to their sparsity and lack of independence between entries. However,
extensive simulations ([JMRR99]) point to conjectures that these matrices still
belong to the universality class of random matrices. For example, it is
conjectured via simulations ([MNS08]) that the distribution of the second
largest eigenvalue (in absolute value) is given by the Tracy-Widom distribution.
In the physics literature, eigenvalues of random regular graphs have been
considered as a toy model of quantum chaos ([Smi10], [OGS09], [OS10]).
Simulations suggest that the eigenvalue spacing distribution has the same limit
as that of the Wigner matrices. A limiting Gaussian wave character of
eigenvectors has also been conjectured ([Elo08, Elo10, ES10]). Some fine
properties of eigenvalues and eigenvectors can indeed be proved for a single
permutation matrix; see [Wie00] and [BAD11].

Somewhat complicating the matter is the fact that when the degree $d$ is kept
fixed and we let $n$ go to infinity, several classical results about random
matrix ensembles fail. A bit more elaboration on this point is needed. The two
parameters in the ensemble of random graphs are the degree $d$ and the order
$n$. In the permutation model it is possible to construct random regular graphs
for every possible value of $(d, n)$ where $d$ is an even positive integer and
$n$ is any positive integer. Hence one can consider various kinds of limits of
these parameters. We will refer to the procedure of taking a sequence of
$(d, n)$ where both these parameters simultaneously go to infinity as the
diagonal limit. To maintain sparsity,^1 it is usually assumed that $d$ is at
most poly-logarithmic in $n$. No lower bound on the growth rate of $d$ is
assumed. However, results are often easier to prove when $d$ is kept fixed and
we let $n$ go to infinity. Suppose for each $d$ one gets a limiting object (say
a probability distribution); one can now take $d$ to infinity and explore limits
of the sequence of these objects. We will refer to this procedure
($\lim_{d\to\infty} \lim_{n\to\infty}$) as the triangular limit. The triangular
limit is often identical to the diagonal limit irrespective of the sequence
through which the diagonal limit is taken, while maintaining sparsity. Moreover,
these limiting statistics frequently match with those of the GOE ensemble and
the real symmetric Wigner matrices. This is true, for example, for the empirical
spectral distribution [DP10, TVW10] and fluctuations of smooth linear eigenvalue
statistics [DJPP11].

Our present result is a triangular limit result. One of the reasons why we
cannot prove a full GFF convergence is that the parameters $d$ and $n$ behave
independently of one another. The degree $d$ determines the support of the
spectral distribution $[-2\sqrt{2d-1},\ 2\sqrt{2d-1}]$, asymptotically
independent of $n$. For Wigner matrices, the dimension itself determines the
length of the spectral support. This results in the parametrization of (1). A
corresponding parametrization in our case is not obvious.

2. Preliminaries 

2.1. A primer on the Chinese Restaurant Process. The CRP, introduced by Dubins
and Pitman, is a particular example of a two-parameter family of stochastic
processes that sequentially constructs random exchangeable partitions of the
positive integers via the cyclic decomposition of a random permutation. Our
short description is taken from [Pit06, Section 3.1].

An initially empty restaurant has an unlimited number of circular tables
numbered $1, 2, \ldots$, each capable of seating an unlimited number of
customers. Customers numbered $1, 2, \ldots$ arrive one by one and are seated at
the tables according to the following plan. Person 1 sits at table 1. For
$n \ge 1$ suppose that $n$ customers have already entered the restaurant, and
are seated in some arrangement, with at least one customer at each of the tables
$j$ for $1 \le j \le k$ (say), where $k$ is the number of tables occupied by the
first $n$ customers to arrive. Let customer $n + 1$ choose with equal
probability to sit at any of the following $n + 1$ places: to the left of
customer $j$ for some $1 \le j \le n$, or alone at table $k + 1$. Define
$\pi^{(n)} \colon [n] \to [n]$ as the permutation whose cyclic decomposition is
given by the tables; that is, if after $n$ customers have entered the
restaurant, customers $i$ and $j$ are seated at the same table, with $i$ to the
left of $j$, then $\pi^{(n)}(i) = j$, and if customer $i$ is seated alone at
some table then $\pi^{(n)}(i) = i$. The sequence $(\pi^{(n)})$ then has features
(i) and (ii) mentioned in the first paragraph of Section 1.1.

^1 The non-sparse case can typically be absorbed within standard techniques of
RMT by comparing with a corresponding Erdős-Rényi graph whose adjacency matrix
has independent entries.

Figure 1. A cycle whose word is the equivalence class of
$\pi_2 \pi_1^{-1} \pi_2 \pi_1 \pi_2 \pi_3^{-1}$ in $\mathcal{W}_6/D_{12}$.
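
The seating rule can be turned into a short simulation. The sketch below is only illustrative: the names `crp_tower` and `delete_largest` and the dictionary representation of a permutation are assumptions made here. It grows $\pi^{(n)}$ one customer at a time and lets one check property (ii) directly.

```python
import random


def crp_tower(n_max, rng):
    """Return [pi^(1), ..., pi^(n_max)], each as a dict i -> pi(i)."""
    pi, inv = {}, {}  # the permutation and its inverse
    tower = []
    for new in range(1, n_max + 1):
        # `new` equally likely places: left of customer j (j < new),
        # or alone at a fresh table (encoded here as choice 0).
        choice = rng.randrange(new)
        if choice == 0:
            pi[new] = new
            inv[new] = new
        else:
            j = choice
            i = inv[j]  # the customer previously seated just left of j
            pi[i], inv[new] = new, i
            pi[new], inv[j] = j, new
        tower.append(dict(pi))
    return tower


def delete_largest(pi):
    """Remove the largest element from its cycle (condition (ii))."""
    n = max(pi)
    return {i: (pi[n] if img == n else img) for i, img in pi.items() if i != n}


tower = crp_tower(6, random.Random(0))
print(tower[-1])
```

Deleting the element $n$ from $\pi^{(n)}$ with `delete_largest` should recover $\pi^{(n-1)}$ at every level of the tower.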

2.2. Combinatorics on words. The graph $G_n$, formed from the independent
permutations $\pi_1^{(n)}, \ldots, \pi_d^{(n)}$, can be considered as a
directed, edge-labeled graph in a natural way. For convenience, drop
superscripts and let $\pi_l = \pi_l^{(n)}$. If $\pi_l(i) = j$, then by
definition $G_n$ contains an edge between $i$ and $j$. When convenient, we
consider this edge to be directed from $i$ to $j$ and to be labeled by $\pi_l$.

Consider a walk on $G_n$, viewed in this way, and imagine writing down the label
of each edge as it is traversed, putting $\pi_i$ or $\pi_i^{-1}$ according to
the direction we walk over the edge. If the walk forms a cycle, then the
resulting word $w = w_1 \cdots w_k$ is cyclically reduced, i.e.,
$w_i \neq w_{i+1}^{-1}$ for all $i$, considering $i$ modulo $k$.

Let $\mathcal{W}_k$ denote the set of cyclically reduced words of length $k$. We
would like to associate each $k$-cycle in $G_n$ with the word in $\mathcal{W}_k$
formed by the above procedure, but since we can start the walk at any point in
the cycle and walk in either of two directions, there are actually up to $2k$
different words that could be formed by it. Thus we identify elements of
$\mathcal{W}_k$ that differ only by rotation and inversion (for example,
$\pi_1 \pi_2^{-1} \pi_1 \pi_2$ and $\pi_1^{-1} \pi_2 \pi_1^{-1} \pi_2^{-1}$) and
denote the resulting set by $\mathcal{W}_k/D_{2k}$, where $D_{2k}$ is the
dihedral group acting on the set $\mathcal{W}_k$ in the natural way.

Definition 7 (Properties of words). For any $k$-cycle in $G_n$, the element of
$\mathcal{W}_k/D_{2k}$ given by walking around the cycle is called the word of
the cycle (see Figure 1). For any word $w$, let $|w|$ denote the length of $w$.
Let $h(w)$ be the largest number $m$ such that $w = u^m$ for some word $u$. If
$h(w) = 1$, we call $w$ primitive. For any $w \in \mathcal{W}_k$, the orbit of
$w$ under the action of $D_{2k}$ contains $2k/h(w)$ elements, a fact which we
will frequently use. Let $c(w)$ denote the number of pairs of double letters in
$w$, i.e., the number of integers $i$ modulo $|w|$ such that $w_i = w_{i+1}$.
For example, $c(\pi_1 \pi_1 \pi_2^{-1} \pi_2^{-1} \pi_1) = 3$. We will also
consider $|\cdot|$, $h(\cdot)$, and $c(\cdot)$ as functions on
$\mathcal{W}_k/D_{2k}$, since they are invariant under cyclic rotation and
inversion.
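
These statistics are easy to compute mechanically. In the sketch below a word is a tuple of nonzero integers, with $+i$ standing for $\pi_i$ and $-i$ for $\pi_i^{-1}$; this encoding and the function names are assumptions made for illustration, not notation from the paper.

```python
def h(w):
    """Largest m such that w = u^m for some word u."""
    k = len(w)
    for m in range(k, 0, -1):
        if k % m == 0 and w == w[: k // m] * m:
            return m


def c(w):
    """Number of i (mod |w|) with w_i = w_{i+1}.

    The sketch is only used for |w| >= 2; the one-letter case is left
    to whatever convention the reader adopts.
    """
    k = len(w)
    return sum(w[i] == w[(i + 1) % k] for i in range(k))


def orbit(w):
    """Orbit of w under rotations and inversion (the dihedral action)."""
    k = len(w)
    inverse = tuple(-x for x in reversed(w))
    return {v[r:] + v[:r] for v in (w, inverse) for r in range(k)}


def canonical(w):
    """One possible canonical representative: the smallest orbit element."""
    return min(orbit(w))


w = (1, 1, -2, -2, 1)  # pi_1 pi_1 pi_2^{-1} pi_2^{-1} pi_1
print(c(w), h(w), len(orbit(w)))
```

For the example word in Definition 7 this reports three double-letter pairs, and the orbit size agrees with $2k/h(w)$.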

To more easily refer to words in $\mathcal{W}_k/D_{2k}$, choose some canonical
representative $w_1 \cdots w_k \in \mathcal{W}_k$ for every
$w \in \mathcal{W}_k/D_{2k}$. Based on this, we will often think of elements of
$\mathcal{W}_k/D_{2k}$ as words instead of equivalence classes, and we will make
statements about the $i$th letter of a word in $\mathcal{W}_k/D_{2k}$. For
$w = w_1 \cdots w_k \in \mathcal{W}_k/D_{2k}$, let $w^{(i)}$ refer to the word
in $\mathcal{W}_{k+1}/D_{2k+2}$ given by
$w_1 \cdots w_i w_i w_{i+1} \cdots w_k$. We refer to this operation as doubling
the $i$th letter of $w$. A related operation is to halve a pair of double
letters, for example producing $\pi_1 \pi_2 \pi_3 \pi_4$ from
$\pi_1 \pi_2 \pi_2 \pi_3 \pi_4$. (Since we apply these operations to words
identified with their rotations, we do not need to be specific about which
letter of the pair is deleted.) The following technical lemma underpins most of
our combinatorial calculations.

Lemma 8. Let $u \in \mathcal{W}_k/D_{2k}$ and
$w \in \mathcal{W}_{k+1}/D_{2k+2}$. Suppose that $a$ letters in $u$ can be
doubled to form $w$, and $b$ pairs of double letters in $w$ can be halved to
form $u$. Then

$$\frac{a}{h(u)} = \frac{b}{h(w)}.$$

Remark 9. At first glance, one might expect that $a = b$. The example
$u = \pi_1 \pi_2 \pi_1 \pi_1 \pi_2$ and
$w = \pi_1 \pi_1 \pi_2 \pi_1 \pi_1 \pi_2$ shows that this is wrong, since only
one letter in $u$ can be doubled to give $w$, but two different pairs in $w$ can
be halved to give $u$.

Proof. Let $\mathrm{Orb}(u)$ and $\mathrm{Orb}(w)$ denote the orbits of $u$ and
$w$ under the action of the dihedral group in $\mathcal{W}_k$ and
$\mathcal{W}_{k+1}$, respectively. When we speak of halving a pair of letters in
a word in $\mathrm{Orb}(w)$, always delete the second of the two letters (for
example, $\pi_1 \pi_2 \pi_1$ becomes $\pi_1 \pi_2$, not $\pi_2 \pi_1$). When we
double a letter in a word in $\mathrm{Orb}(u)$, put the new letter after the
doubled letter (for example, doubling the second letter of $\pi_1 \pi_2$ gives
$\pi_1 \pi_2 \pi_2$, not $\pi_2 \pi_1 \pi_2$).

For each of the $2k/h(u)$ words in $\mathrm{Orb}(u)$, there are $a$ doubling
operations yielding a word in $\mathrm{Orb}(w)$. For each of the
$(2k+2)/h(w)$ words in $\mathrm{Orb}(w)$, there are $b$ halving operations
yielding a word in $\mathrm{Orb}(u)$. For every halving operation on a word in
$\mathrm{Orb}(w)$, there is a corresponding doubling operation on a word in
$\mathrm{Orb}(u)$ and vice versa, except for halving operations that straddle
the ends of the word, as in $\pi_1 \pi_2 \pi_1$. There are $2b/h(w)$ of these,
giving us

$$\frac{2ka}{h(u)} = \frac{(2k+2)b}{h(w)} - \frac{2b}{h(w)} = \frac{2kb}{h(w)},$$

and the lemma follows from this. □
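
Lemma 8 can be checked numerically on the example of Remark 9. The sketch below uses the same illustrative encoding as before (a word is a tuple of nonzero integers, $+i$ for $\pi_i$ and $-i$ for $\pi_i^{-1}$); the helper names `doublings` and `halvings` are hypothetical and simply count the quantities $a$ and $b$ of the lemma.

```python
def orbit(w):
    """Orbit of w under rotations and inversion."""
    k = len(w)
    inverse = tuple(-x for x in reversed(w))
    return {v[r:] + v[:r] for v in (w, inverse) for r in range(k)}


def canonical(w):
    """Smallest orbit element, used as class representative."""
    return min(orbit(w))


def h(w):
    """Largest m such that w = u^m for some word u."""
    k = len(w)
    for m in range(k, 0, -1):
        if k % m == 0 and w == w[: k // m] * m:
            return m


def doublings(u, w):
    """a: number of letters of u whose doubling gives the class of w."""
    target = canonical(w)
    return sum(canonical(u[: i + 1] + u[i:]) == target for i in range(len(u)))


def halvings(w, u):
    """b: number of cyclic double-letter pairs of w whose halving gives u."""
    k, target = len(w), canonical(u)
    return sum(
        w[i] == w[(i + 1) % k] and canonical(w[:i] + w[i + 1:]) == target
        for i in range(k)
    )


u = (1, 2, 1, 1, 2)      # pi_1 pi_2 pi_1 pi_1 pi_2
w = (1, 1, 2, 1, 1, 2)   # pi_1 pi_1 pi_2 pi_1 pi_1 pi_2
a, b = doublings(u, w), halvings(w, u)
print(a, b, h(u), h(w))  # the lemma asserts a / h(u) == b / h(w)
```

Here $a \neq b$, but the two ratios in the lemma agree because $w$ is the square of $\pi_1 \pi_1 \pi_2$ while $u$ is primitive.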



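Let $\mathcal{W}' = \bigcup_{k=1}^{\infty} \mathcal{W}_k/D_{2k}$, and let
$\mathcal{W}'_L = \bigcup_{k=1}^{L} \mathcal{W}_k/D_{2k}$. We will use the
previous lemma to prove the following technical property of the $c(\cdot)$
statistic.

Lemma 10. In the vector space with basis $\{q_w\}_{w \in \mathcal{W}'_L}$,

$$\sum_{w \in \mathcal{W}'_{L-1}} \sum_{i=1}^{|w|} \frac{1}{h(w)}\, q_{w^{(i)}} = \sum_{w \in \mathcal{W}'_L} \frac{c(w)}{h(w)}\, q_w.$$

Proof. Fix some $w \in \mathcal{W}_k/D_{2k}$, and let $a(u)$ denote the number
of letters of $u$ that can be doubled to give $w$, for any
$u \in \mathcal{W}_{k-1}/D_{2k-2}$. We need to prove that

$$\sum_{u \in \mathcal{W}_{k-1}/D_{2k-2}} \frac{a(u)}{h(u)} = \frac{c(w)}{h(w)}.$$

Let $b(u)$ be the number of pairs in $w$ that can be halved to give $u$. By
Lemma 8,

$$\sum_{u \in \mathcal{W}_{k-1}/D_{2k-2}} \frac{a(u)}{h(u)} = \sum_{u \in \mathcal{W}_{k-1}/D_{2k-2}} \frac{b(u)}{h(w)},$$

and $\sum_{u \in \mathcal{W}_{k-1}/D_{2k-2}} b(u) = c(w)$. □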

Figure 2. The vertex 6 is inserted between vertices 2 and 3 in $\pi_1$,
causing the above cycle to grow.

Figure 3. A cycle forms "spontaneously" when the vertex 6 is inserted into the
graph. Before the insertion, $\pi_1 = (2\,3\,1)(4\,5)$ and
$\pi_2 = (2\,1\,3\,4\,5)$; after the insertion, $\pi_1 = (2\,3\,1\,6)(4\,5)$ and
$\pi_2 = (2\,1\,3\,4\,6\,5)$.



3. The process limit of the cycle structure 

As the graph $G(t)$ grows, new cycles form, which we can classify into two
types. Suppose a new vertex numbered $n$ is inserted at time $t$, and this
insertion creates a new cycle. If the edges entering and leaving vertex $n$ in
the new cycle have the same edge label, then the new cycle has "grown" from a
cycle with one fewer vertex, as in Figure 2. If the edges entering and leaving
$n$ in the cycle have different labels, then the cycle has formed
"spontaneously" as in Figure 3, rather than growing from a smaller cycle. This
classification will prove essential in understanding the evolution of cycles in
$G(t)$.

Once a cycle comes into existence in $G(t)$, it remains until a new vertex is
inserted into one of its edges. Typically, this results in the cycle growing to
a larger cycle, as in Figure 2. If a new vertex is simultaneously inserted into
multiple edges of the same cycle, the cycle is instead split into smaller cycles
as in Figure 4. These new cycles are spontaneously formed, according to the
classification of new cycles given in the previous paragraph. Tracking the
evolution of these smaller cycles in turn, we see that as the graph evolves, a
cycle grows into a cluster of overlapping cycles. However, it will follow from
Theorem 15 that for short cycles, this behavior is not typical. Thus in our
limiting object, cycles will grow only into larger cycles.



Figure 4. The vertex 6 is inserted into the cycle in two different places in
the same step, causing the cycle to split in two. Note that each new cycle would
be classified as spontaneously formed. Before the insertion,
$\pi_1 = (1\,2\,3)(4\,5)$ and $\pi_2 = (1\,5)(4\,3)(2)$; after the insertion,
$\pi_1 = (1\,2\,6\,3)(4\,5)$ and $\pi_2 = (1\,5\,6)(4\,3)(2)$.



3.1. Heuristics for the limiting process. We give some estimates that will
motivate the definition of the limiting process in Section 3.2. This section is
entirely motivational, and we will not attempt to make anything rigorous.

Suppose that vertex $n$ is inserted into $G(t)$ at some time $t$. First, we
consider the rate that cycles form spontaneously with some word
$w \in \mathcal{W}_k/D_{2k}$. There are $2k/h(w)$ words in the orbit of $w$
under the action of $D_{2k}$, and out of these, $2(k - c(w))/h(w)$ have nonequal
first and last letters. For each such word $u = u_1 \cdots u_k$, we can give a
walk on the graph by starting at vertex $n$ and following the edges indicated by
$u$, going from $n$ to $u_1(n)$ to $u_2(u_1(n))$ and so on. If this walk happens
to be a cycle, the condition $u_1 \neq u_k$ implies that it would be
spontaneously formed.

In a short interval $\Delta t$ when $G(t)$ has $n - 1$ vertices, the
probability that vertex $n$ is inserted is about $n \Delta t$. For any word $u$,
the walk from vertex $n$ generated by $u$ is a cycle with probability
approximately $1/n$. Any new spontaneous cycle formed with word $w$ will be
counted by one of these walks, with $u$ in the orbit of $w$, and it will be
counted again by the walk generated by $u_k^{-1} \cdots u_1^{-1}$. Thus the
expected number of spontaneous cycles formed in a short interval $\Delta t$ is
approximately

$$\frac{1}{2} \cdot \frac{2(k - c(w))}{h(w)} \cdot \frac{1}{n} \cdot n \Delta t = \frac{k - c(w)}{h(w)}\, \Delta t.$$

Thus we will model the spontaneous formation of cycles with word $w$ by a
Poisson process with rate $(k - c(w))/h(w)$.
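
One way to see that this heuristic rate is consistent with the measure $\mu(k)$ of Section 1.1 is to sum it over all words of length $k$. The sketch below does so by brute force for $k \ge 2$. It relies on two assumptions that are not spelled out in this section: that $a(d,k)$ counts cyclically reduced words of length $k$ (a standard fact), and the signed-integer encoding of letters, which is purely illustrative. Since each class $w$ has $2k/h(w)$ representatives, the sum over classes of $(k - c(w))/h(w)$ equals the sum over all words of $(k - c(w))/2k$.

```python
from fractions import Fraction
from itertools import product


def a(d, k):
    """The quantity a(d, k) of Section 1.1, with a(d, 0) := 0."""
    if k == 0:
        return 0
    return (2 * d - 1) ** k + (2 * d - 1 if k % 2 == 0 else 1)


def cyclically_reduced_words(d, k):
    """All cyclically reduced words of length k (+i is pi_i, -i its inverse)."""
    letters = [s * i for i in range(1, d + 1) for s in (1, -1)]
    for w in product(letters, repeat=k):
        if all(w[i] != -w[(i + 1) % k] for i in range(k)):
            yield w


def c(w):
    """Number of cyclic positions i with w_i = w_{i+1}."""
    k = len(w)
    return sum(w[i] == w[(i + 1) % k] for i in range(k))


def total_spontaneous_rate(d, k):
    """Sum over classes of (k - c(w)) / h(w), computed word by word."""
    return sum(Fraction(k - c(w), 2 * k) for w in cyclically_reduced_words(d, k))


for k in range(2, 5):
    print(k, total_spontaneous_rate(2, k), Fraction(a(2, k) - a(2, k - 1), 2))
```

The two printed columns agree, which matches $\mu(k) = \frac{1}{2}[a(d,k) - a(d,k-1)]$ for the lengths checked.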

Next, we consider how often a cycle with word $w \in \mathcal{W}_k$ grows into a
larger cycle. Suppose that $G(t)$ has $n - 1$ vertices, and that it contains a
cycle of the form

$$s_0 \xrightarrow{\ w_1\ } s_1 \xrightarrow{\ w_2\ } \cdots \xrightarrow{\ w_{k-1}\ } s_{k-1} \xrightarrow{\ w_k\ } s_0.$$

When vertex $n$ is inserted into the graph, the probability that it is inserted
after $s_{i-1}$ in permutation $w_i$ is $1/n$. Thus, after a spontaneous cycle
with word $w$ has formed, we can model the evolution of its word as a
continuous-time Markov chain where each letter is doubled with rate one.

3.2. Formal definition of the limiting process. Consider the measure $\mu$ on
$\mathcal{W}'$ given by

$$\mu(w) = \frac{|w| - c(w)}{h(w)}.$$

Consider a Poisson point process $\chi$ on $\mathcal{W}' \times [0, \infty)$
with an intensity measure given by the product measure
$\mu \otimes \mathrm{Leb}$, where $\mathrm{Leb}$ refers to the Lebesgue measure.
Each atom $(w, t)$ of $\chi$ represents a new spontaneous cycle with word $w$
formed at time $t$.

Now, we define a continuous-time Markov chain on the countable space
$\mathcal{W}'$ governed by the following rates: From state
$w \in \mathcal{W}_k/D_{2k}$, jump with rate one to each of the $k$ words in
$\mathcal{W}_{k+1}/D_{2k+2}$ obtained by doubling a letter of $w$. If a word can
be formed in more than one way by doubling a letter in $w$, then it receives a
correspondingly higher rate. For example, from $w = \pi_1 \pi_1 \pi_2$, the
chain jumps to $\pi_1 \pi_1 \pi_1 \pi_2$ with rate two and to
$\pi_1 \pi_1 \pi_2 \pi_2$ with rate one. Let $P_w$ denote the law of this
process started from $w \in \mathcal{W}'$.
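
The jump rates of this chain can be tabulated mechanically. The following sketch (again with the illustrative encoding $+i$ for $\pi_i$, $-i$ for $\pi_i^{-1}$, and a hypothetical helper `jump_rates`) doubles each letter in turn and groups the results by equivalence class.

```python
from collections import Counter


def orbit(w):
    """Orbit of w under rotations and inversion."""
    k = len(w)
    inverse = tuple(-x for x in reversed(w))
    return {v[r:] + v[:r] for v in (w, inverse) for r in range(k)}


def canonical(w):
    """Smallest orbit element, used as class representative."""
    return min(orbit(w))


def jump_rates(w):
    """Map each target class to its total jump rate from w.

    Each of the |w| letters is doubled at rate one; doublings that land
    in the same class add up.
    """
    return Counter(canonical(w[: i + 1] + w[i:]) for i in range(len(w)))


rates = jump_rates((1, 1, 2))  # w = pi_1 pi_1 pi_2
for target, rate in rates.items():
    print(target, rate)
```

For $w = \pi_1 \pi_1 \pi_2$ this reproduces the rates quoted above: rate two to the class of $\pi_1 \pi_1 \pi_1 \pi_2$ and rate one to the class of $\pi_1 \pi_1 \pi_2 \pi_2$.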

Suppose we are given a realization of $\chi$. For any atom $(w, s)$ of the
countably many atoms of $\chi$, we start an independent process
$(X_{w,s}(t),\ t \ge 0)$ with law $P_w$. Define the stochastic process

$$N_w(t) := \sum_{\substack{(u,s) \in \chi \\ s \le t}} \mathbf{1}\{ X_{u,s}(t - s) = w \}.$$

Interpreting these processes as in the previous section, $N_w(t)$ counts the
number of cycles formed spontaneously at some time $s \le t$ that have grown to
have word $w$ at time $t$. The fact that the process exists is obvious since one
can define the countably many independent Markov chains on a suitable product
space. The following lemma establishes some of its key properties.

Lemma 11. Recall that $\mathcal{W}'_L = \bigcup_{k=1}^{L} \mathcal{W}_k/D_{2k}$.
We have the following conclusions:

(i) For any $L \in \mathbb{N}$, the stochastic process
$\{(N_w(t),\ w \in \mathcal{W}'_L),\ t \ge 0\}$ is a time-homogeneous Markov
process with respect to its natural filtration, with RCLL paths.

(ii) Recall that for $w \in \mathcal{W}_k/D_{2k}$, the element
$w^{(i)} \in \mathcal{W}_{k+1}/D_{2k+2}$ is the word formed by doubling the
$i$th letter of $w$. The generator for the Markov process
$\{(N_w(t),\ w \in \mathcal{W}'_L),\ t \ge 0\}$ acts on $f$ at
$x = (x_w,\ w \in \mathcal{W}'_L)$ by

$$\mathcal{L} f(x) = \sum_{w \in \mathcal{W}'_L} \sum_{i=1}^{|w|} x_w \bigl[ f(x - e_w + e_{w^{(i)}}) - f(x) \bigr] + \sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)} \bigl[ f(x + e_w) - f(x) \bigr],$$

where $e_w$ is the canonical basis vector equal to one at entry $w$ and equal to
zero everywhere else. For a word $u$ of length greater than $L$, take $e_u = 0$.

(iii) The product measure of $\mathrm{Poi}(1/h(w))$ over all
$w \in \mathcal{W}'_L$ is the unique invariant measure for this Markov process.

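Proof. Conclusion (i) follows from construction, as does conclusion (ii). To
prove conclusion (iii), we start with the fundamental identity of the Poisson
distribution: if $X \sim \mathrm{Poi}(\lambda)$, then for any function $g$, we
have

(5) $$\mathbf{E} X g(X) = \lambda\, \mathbf{E} g(X + 1).$$

We need to show that if the coordinates of $X = (X_w,\ w \in \mathcal{W}'_L)$
are independent Poisson random variables with $\mathbf{E} X_w = 1/h(w)$, then

(6) $$\mathbf{E} \mathcal{L} f(X) = 0.$$

Since the process is an irreducible Markov chain on a countable state space, the
existence of one invariant distribution shows that the chain is positive
recurrent and that the invariant distribution is unique.

To argue (6) we will repeatedly apply identity (5) to functions $g$ constructed
from $f$ by keeping all but one coordinate fixed. Thus, for any
$w \in \mathcal{W}'_L$ and $1 \le i \le |w|$, we condition on all $X_u$ with
$u \neq w$ and hold those coordinates of $f$ fixed to obtain

$$\mathbf{E} X_w f\bigl( X - e_w + e_{w^{(i)}} \bigr) = \frac{1}{h(w)}\, \mathbf{E} f\bigl( X + e_{w^{(i)}} \bigr),$$

taking $e_{w^{(i)}} = 0$ when $|w| = L$. In the same way,

$$\mathbf{E} X_w f(X) = \frac{1}{h(w)}\, \mathbf{E} f(X + e_w).$$

By these two equalities,

$$\begin{aligned} \mathbf{E} \mathcal{L} f(X) &= \sum_{w \in \mathcal{W}'_L} \sum_{i=1}^{|w|} \frac{1}{h(w)}\, \mathbf{E} \bigl[ f(X + e_{w^{(i)}}) - f(X + e_w) \bigr] + \sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)}\, \mathbf{E} \bigl[ f(X + e_w) - f(X) \bigr] \\ &= \sum_{w \in \mathcal{W}'_L} \sum_{i=1}^{|w|} \frac{1}{h(w)}\, \mathbf{E} f(X + e_{w^{(i)}}) - \sum_{w \in \mathcal{W}'_L} \frac{c(w)}{h(w)}\, \mathbf{E} f(X + e_w) - \sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)}\, \mathbf{E} f(X). \end{aligned}$$

Specializing Lemma 10 to $q_w = \mathbf{E} f(X + e_w)$, and recalling that
$e_{w^{(i)}} = 0$ when $|w| = L$, the first sum is

$$\sum_{w \in \mathcal{W}'_L} \sum_{i=1}^{|w|} \frac{1}{h(w)}\, \mathbf{E} f(X + e_{w^{(i)}}) = \sum_{w \in \mathcal{W}'_L} \frac{c(w)}{h(w)}\, \mathbf{E} f(X + e_w) + \sum_{w \in \mathcal{W}_L/D_{2L}} \frac{|w|}{h(w)}\, \mathbf{E} f(X),$$

which gives us

$$\mathbf{E} \mathcal{L} f(X) = - \sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)}\, \mathbf{E} f(X) + \sum_{w \in \mathcal{W}_L/D_{2L}} \frac{|w|}{h(w)}\, \mathbf{E} f(X).$$

All that remains in proving (6) is to show that

$$\sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)} = \sum_{w \in \mathcal{W}_L/D_{2L}} \frac{|w|}{h(w)}.$$

Specializing Lemma 10 to $q_w = 1$ shows that
$\sum_{w \in \mathcal{W}'_L} c(w)/h(w) = \sum_{w \in \mathcal{W}'_{L-1}} |w|/h(w)$.
Thus

$$\sum_{w \in \mathcal{W}'_L} \frac{|w| - c(w)}{h(w)} = \sum_{w \in \mathcal{W}'_L} \frac{|w|}{h(w)} - \sum_{w \in \mathcal{W}'_{L-1}} \frac{|w|}{h(w)} = \sum_{w \in \mathcal{W}_L/D_{2L}} \frac{|w|}{h(w)},$$

establishing (6) and completing the proof. □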

From now on, we will consider the process $(N_w(t),\ w \in \mathcal{W}',\ t \ge 0)$ to be running
under stationarity, i.e., with marginal distributions given by conclusion (iii)
of the last lemma. This process is easily constructed as described above, but with
additional point masses of weight $1/h(w)$ for each $w \in \mathcal{W}'$ at $(w, 0)$ added to the
intensity measure of $\chi$, thus giving us the correct distribution at time zero.

3.3. Time-reversed processes. Fix some time $T > 0$. We define the time-reversal
$\overleftarrow{N}_w(t) := N_w(T - t)$ for $0 \le t \le T$.

Lemma 12. For any fixed $L \in \mathbb{N}$, the process $\{(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_L),\ 0 \le t \le T\}$ is
a time-homogeneous Markov process with respect to the natural filtration. A trivial
modification at jump times renders RCLL paths. The transition rates of this chain
are given as follows. Let $u \in \mathcal{W}_{k-1}/D_{2k-2}$ and $w \in \mathcal{W}_k/D_{2k}$, and suppose that $u$
can be obtained from $w$ by halving $b$ different pairs. Let $x = (x_w,\ w \in \mathcal{W}'_L)$.
(i) The chain jumps from $x$ to $x + e_u - e_w$ with rate $b x_w$.
(ii) The chain jumps from $x$ to $x - e_w$ with rate $(k - c(w)) x_w$.
(iii) If $w \in \mathcal{W}_L/D_{2L}$, then the chain jumps from $x$ to $x + e_w$ with rate $L/h(w)$.

Proof. Any Markov process run backwards under stationarity is Markov. If the
chain has transition rate $r(x, y)$ from state $x$ to state $y$, then the transition rate of the
backwards chain from $x$ to $y$ is $r(y, x)\nu(y)/\nu(x)$, where $\nu$ is the stationary distribution.
We will let $\nu$ be the stationary distribution from Lemma 11(iii) and calculate
the transition rates of the backwards chain, using the rates given in Lemma 11(ii).
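
The reversal formula $r(y, x)\nu(y)/\nu(x)$ can be illustrated on a toy chain. The sketch below uses a hypothetical three-state cyclic chain, not the chain of Lemma 11, with rates and stationary distribution chosen by hand; the function names are assumptions made for illustration.

```python
from fractions import Fraction

def is_stationary(rates, nu):
    """Check global balance: probability flow out of each state equals flow in."""
    for x in nu:
        out_flow = sum(r for (a, _), r in rates.items() if a == x) * nu[x]
        in_flow = sum(r * nu[a] for (a, b), r in rates.items() if b == x)
        if out_flow != in_flow:
            return False
    return True

def reverse_rates(rates, nu):
    """Reversed chain: a forward rate r(a, b) becomes a rate r(a, b) nu(a) / nu(b) from b to a."""
    return {(b, a): r * nu[a] / nu[b] for (a, b), r in rates.items()}

# Hypothetical forward chain 0 -> 1 -> 2 -> 0 with rates 2, 3, 1.
rates = {(0, 1): Fraction(2), (1, 2): Fraction(3), (2, 0): Fraction(1)}
nu = {0: Fraction(3, 11), 1: Fraction(2, 11), 2: Fraction(6, 11)}

reversed_rates = reverse_rates(rates, nu)
print(is_stationary(rates, nu), reversed_rates)
```

The reversed chain runs around the cycle in the opposite direction, keeps the same total exit rate from each state, and has the same stationary distribution.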

Let $a$ denote the number of letters in $u$ that give $w$ when doubled. The transition
rate of the original chain from $x + e_u - e_w$ to $x$ is $a(x_u + 1)$, so the transition rate
of the backwards chain from $x$ to $x + e_u - e_w$ is

$$a(x_u + 1)\, \frac{\nu(x + e_u - e_w)}{\nu(x)} = \frac{a\, h(w)\, x_w}{h(u)},$$

and this is equal to $b x_w$ by Lemma 9. A similar calculation shows that the transition
rate from $x$ to $x - e_w$ is

$$\frac{(k - c(w))\, \nu(x - e_w)}{h(w)\, \nu(x)} = (k - c(w))\, x_w,$$

proving (ii). The transition rate from $x$ to $x + e_w$ for $w \in \mathcal{W}_L/D_{2L}$ is

$$L (x_w + 1)\, \frac{\nu(x + e_w)}{\nu(x)} = \frac{L}{h(w)},$$

which completes the proof. □

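By definition,

$$\overleftarrow{N}_w(t) = \sum_{\substack{(u,s) \in \chi \\ s \le T - t}} \mathbf{1}\{ X_{u,s}(T - t - s) = w \}.$$

We will modify this slightly to define the process

$$\overleftarrow{M}_w(t) := \sum_{\substack{(u,s) \in \chi \\ s \le T - t}} \mathbf{1}\{ X_{u,s}(T - t - s) = w \text{ and } |X_{u,s}(T - s)| \le L \}.$$

The idea is that $\overleftarrow{M}_w(t)$ is the same as $\overleftarrow{N}_w(t)$, except that it does not count cycles at
time $t$ that had more than $L$ vertices at time zero. The process $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$
is a Markov chain with the same transition rates as $(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_L)$, except that
it does not jump from $x$ to $x + e_w$ for $w \in \mathcal{W}_L/D_{2L}$. These two chains also have
the same initial distribution, but $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$ is not stationary (in fact, it is
eventually absorbed at zero).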

4. Process convergence

Theorem 13. The process $(C_w^{(s)}(\cdot),\ w \in \mathcal{W}')$ converges in law as $s \to \infty$ to
$(N_w(\cdot),\ w \in \mathcal{W}')$.



Proof. The main difficulty in turning the intuitive ideas of Section 3.1 into an actual
proof is that $(C_w^{(s)}(t),\ w \in \mathcal{W}')$ is not Markov. We now sketch how we evade this
problem. We will run our chain backwards, defining $\overleftarrow{G}_s(t) = G(s + T - t)$ for
some fixed $T > 0$. Then, we ignore all of $\overleftarrow{G}_s(0)$ except for the subgraph consisting
of cycles of size $L$ and smaller, which we will call $\overleftarrow{\Gamma}_s(0)$. The graph $\overleftarrow{\Gamma}_s(t)$ is the
evolution of this subgraph as time runs backward, ignoring the rest of $\overleftarrow{G}_s(t)$. Then,
we consider the number of cycles with word $w$ in $\overleftarrow{\Gamma}_s(t)$, which we call $\phi_w(\overleftarrow{\Gamma}_s(t))$.
Choose $K \ll L$. Then $\phi_w(\overleftarrow{\Gamma}_s(t))$ is likely to be the same as $C_w^{(s)}(T - t)$ for any word
$w$ with $|w| \le K$. The remarkable fact that makes $\phi_w(\overleftarrow{\Gamma}_s(t))$ possible to analyze
is that if $\overleftarrow{\Gamma}_s(0)$ consists of disjoint cycles, then $(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_L)$ is a Markov
chain governed by the same transition rates as $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$.

Another important idea of the proof is to ignore the vertex labels in $\overleftarrow{G}_s(t)$, so
that we do not know in what order the vertices will be removed. Thus we can
view $\overleftarrow{G}_s(t)$ as a Markov chain with the following description: Assign each vertex
an independent Exp(1) clock. When the clock of vertex $v$ goes off, remove it from
the graph, and patch together the $\pi_i$-labeled edges entering and leaving $v$ for each
$1 \le i \le d$.
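
The vertex-removal step can be sketched concretely. In the code below a permutation is stored as a dictionary mapping each vertex to its image; this representation and the function names are assumptions made for illustration, and the exponential clocks are not simulated.

```python
def remove_vertex(perm, v):
    """Delete v from a permutation given as a dict i -> perm(i),
    joining the edge entering v to the edge leaving v."""
    new_perm = dict(perm)
    successor = new_perm.pop(v)
    if successor != v:  # v is not a fixed point, so patch around it
        predecessor = next(i for i, j in new_perm.items() if j == v)
        new_perm[predecessor] = successor
    return new_perm

def remove_vertex_from_graph(perms, v):
    """Apply the same patching to each of the d permutations defining the graph."""
    return [remove_vertex(perm, v) for perm in perms]

# Hypothetical example with d = 2 permutations on the vertices {1, 2, 3}.
pi_1 = {1: 2, 2: 3, 3: 1}  # a 3-cycle
pi_2 = {1: 1, 2: 3, 3: 2}  # a fixed point and a 2-cycle
print(remove_vertex_from_graph([pi_1, pi_2], 3))
```

Removing a vertex from a cycle of a permutation shortens that cycle by one, which is the reverse of inserting a customer into a cycle in the Chinese Restaurant Process.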

Step 1. Definitions of $\overleftarrow{\Gamma}_s(t)$ and $\phi_w$ and analysis of $(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_L)$.

Fix $T > 0$ and define $\overleftarrow{G}_s(t) = G(s + T - t)$. As mentioned above, we will consider
$\overleftarrow{G}_s(t)$ only up to relabeling of vertices, which makes it a process on the countable
state space consisting of all edge-labeled graphs on finitely many unlabeled vertices.
With respect to its natural filtration, it is a Markov chain in which each vertex is
removed with rate one, as described above.

To formally define $\overleftarrow{\Gamma}_s(t)$, fix integers $L > K$ and let $\overleftarrow{\Gamma}_s(0)$ be the subgraph of
$\overleftarrow{G}_s(0)$ made up of all cycles of length $L$ or less. We then evolve $\overleftarrow{\Gamma}_s(t)$ in parallel
with $\overleftarrow{G}_s(t)$. When a vertex $v$ is deleted from $\overleftarrow{G}_s(t)$, the corresponding vertex $v$ in
$\overleftarrow{\Gamma}_s(t)$ is deleted if it is present. If $v$ has a $\pi_i$-labeled edge entering and leaving it in
$\overleftarrow{\Gamma}_s(t)$, then these two edges are patched together. Other edges in $\overleftarrow{\Gamma}_s(t)$ adjacent to
$v$ are deleted. This makes $\overleftarrow{\Gamma}_s(t)$ a subgraph of $\overleftarrow{G}_s(t)$, as well as a continuous-time
Markov chain on the countable state space consisting of all edge-labeled graphs
on finitely many unlabeled vertices. The transition probabilities of $\overleftarrow{\Gamma}_s(t)$ do not
depend on $s$.

From Theorem 17, we can find the limiting distribution of $\overleftarrow{\Gamma}_s(0)$. Suppose that
$\gamma$ is a graph in the process's state space that is not a disjoint union of cycles. By
Theorem 17(ii),

$$\lim_{s \to \infty} \mathbf{P}[\overleftarrow{\Gamma}_s(0) = \gamma] = 0.$$

Suppose instead that $\gamma$ is made up of disjoint cycles, with $z_w$ cycles of word $w$ for
each $w \in \mathcal{W}'_L$. By Theorem 17(i),

$$\lim_{s \to \infty} \mathbf{P}[\overleftarrow{\Gamma}_s(0) = \gamma] = \prod_{w \in \mathcal{W}'_L} \mathbf{P}[Z_w = z_w], \tag{7}$$

where $(Z_w,\ w \in \mathcal{W}'_L)$ are independent Poisson random variables with $\mathbf{E} Z_w = 1/h(w)$.
Thus $\overleftarrow{\Gamma}_s(0)$ converges in law as $s \to \infty$ to a limiting distribution supported
on the graphs made up of disjoint unions of cycles. For different values of
$s$, the chains $\overleftarrow{\Gamma}_s(t)$ differ only in their initial distributions, and the convergence in
law of $\overleftarrow{\Gamma}_s(0)$ as $s \to \infty$ induces the process convergence of $\{\overleftarrow{\Gamma}_s(t),\ 0 \le t \le T\}$
to a Markov chain $\{\overleftarrow{\Gamma}(t),\ 0 \le t \le T\}$ with the same transition rates whose initial
distribution is the limit of $\overleftarrow{\Gamma}_s(0)$.

For any finite edge-labeled graph $G$, let $\phi_w(G)$ be the number of cycles in $G$ with
word $w$. By the continuous mapping theorem, the process $(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_L)$
converges in law to $(\phi_w(\overleftarrow{\Gamma}(t)),\ w \in \mathcal{W}'_L)$ as $s \to \infty$.

We will now demonstrate that this process has the same law as $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$.
The graph $\overleftarrow{\Gamma}(t)$ consists of disjoint cycles at time $t = 0$, and as it evolves,
these cycles shrink or are destroyed. The process $(\phi_w(\overleftarrow{\Gamma}(t)),\ w \in \mathcal{W}'_L)$ jumps
exactly when a vertex in a cycle in $\overleftarrow{\Gamma}(t)$ is deleted. If the deleted vertex lies in a
cycle between two edges with the same label, the cycle shrinks. If the deleted vertex
lies in a cycle between two edges with different labels, the cycle is destroyed. The
only relevant consideration in where the process will jump at time $t$ is the number
of vertices of these two types in $\overleftarrow{\Gamma}(t)$, which can be deduced from $(\phi_w(\overleftarrow{\Gamma}(t)),\ w \in \mathcal{W}'_L)$.
Thus this process is a Markov chain.

Consider two words $u, w \in \mathcal{W}'_L$ such that $w$ can be obtained from $u$ by doubling
a letter. Suppose that $u$ can be obtained from $w$ by halving any of $b$ pairs of letters.
Suppose that the chain is at state $x = (x_v,\ v \in \mathcal{W}'_L)$. There are $b x_w$ vertices that
when deleted cause the chain to jump from $x$ to $x - e_w + e_u$, each of which is
removed with rate one. Thus the chain jumps from $x$ to $x - e_w + e_u$ with rate $b x_w$.
Similarly, it jumps to $x - e_w$ with rate $(|w| - c(w)) x_w$. These are the same rates as
the chain $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$ from Section 3.3. The initial distribution given by (7)
is also the same as that of $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$, demonstrating that the two processes
$(\phi_w(\overleftarrow{\Gamma}(t)),\ w \in \mathcal{W}'_L)$ and $(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_L)$ have the same law.



Step 2. Approximation of $\overleftarrow{C}_w^{(s)}(t)$ by $\phi_w(\overleftarrow{\Gamma}_s(t))$.

Write $\overleftarrow{C}_w^{(s)}(t) = C_w^{(s)}(T - t)$ for the number of cycles with word $w$ in $\overleftarrow{G}_s(t)$.
We will compare the two processes $\{(\overleftarrow{C}_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and
$\{(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and show that for sufficiently large $L$, they are
identical with probability arbitrarily close to one.

Consider some cycle in $\overleftarrow{G}_s(t)$; we can divide its vertices into those that lie
between two edges of the cycle with different labels, and those that lie between
two edges with the same label. We call this second class the shrinking vertices of
the cycle, because if one is deleted from $\overleftarrow{G}_s(t)$ as it evolves, the cycle shrinks. We
define $E_s(L)$ to be the event that for some cycle in $\overleftarrow{G}_s(0)$ of size $l > L$, at least
$l - K$ of its shrinking vertices are deleted by time $T$.

We claim that outside of the event $E_s(L)$, the two processes $\{(\overleftarrow{C}_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$
and $\{(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ are identical. Suppose
that these two processes are not identical. Then there is some cycle $\alpha$ of size $K$ or
less present in $\overleftarrow{G}_s(t)$ but not in $\overleftarrow{\Gamma}_s(t)$ for some $0 \le t \le T$. As explained in Section 3.1,
as a cycle evolves (in forward time), it grows into an overlapping cluster of cycles.
Thus $\overleftarrow{G}_s(0)$ contains some cluster of overlapping cycles that shrinks to $\alpha$ at time
$t$. One of the cycles in this cluster has length greater than $L$, or the cluster would
be contained in $\overleftarrow{\Gamma}_s(0)$ and $\alpha$ would have been contained in $\overleftarrow{\Gamma}_s(t)$.

To see that $l - K$ shrinking vertices must be deleted from this cycle, consider
the evolution of $\alpha$ into the cluster of cycles in both forward and reverse time. If
a vertex is inserted into a single edge of a cycle in forward time, we see in reverse
time the deletion of a shrinking vertex. If a vertex is simultaneously inserted into
two edges of a cycle, causing the cycle to split, we see in reverse time the deletion
of a non-shrinking vertex of a cycle. As $\alpha$ grows, a cycle of size greater than $L$ can
form only by single-insertion of at least $l - K$ vertices into the eventual cycle. In
reverse time, this is seen as deletion of $l - K$ shrinking vertices. This demonstrates
that $E_s(L)$ holds.

We will now show that for any $\epsilon > 0$, there is an $L$ sufficiently large that
$\mathbf{P}[E_s(L)] < \epsilon$ for any $s$. Let $w \in \mathcal{W}_l/D_{2l}$ with $l > L$, and let $I \subset [l]$ such that
$|I| = l - K$ and $w_i = w_{i-1}$ for all $i \in I$, considering indices modulo $l$. For any cycle
in $\overleftarrow{G}_s(0)$ with word $w$, the set $I$ corresponds to a set of $l - K$ shrinking vertices of
the cycle.

We define $F(w, I)$ to be the event that $\overleftarrow{G}_s(0)$ contains one or more cycles with
word $w$, and that the vertices corresponding to $I$ in one of these cycles are all
deleted within time $T$. By a union bound,

$$\mathbf{P}[E_s(L)] \le \sum_{w, I} \mathbf{P}[F(w, I)]. \tag{8}$$

We proceed by enumerating all pairs of $w$ and $I$. For any pair $w, I$, deleting the
letters in $w$ at positions given by $I$ results in a word $u \in \mathcal{W}_K/D_{2K}$. For any given
$u = u_1 \cdots u_K \in \mathcal{W}_K/D_{2K}$, the word $w \in \mathcal{W}_l/D_{2l}$ must have the form

$$w = \underbrace{u_1 \cdots u_1}_{a_1 \text{ times}}\ \underbrace{u_2 \cdots u_2}_{a_2 \text{ times}} \cdots \underbrace{u_K \cdots u_K}_{a_K \text{ times}},$$

with $a_i \ge 1$ and $a_1 + \cdots + a_K = l$. The number of choices for $a_1, \ldots, a_K$ is $\binom{l-1}{K-1}$,
the number of compositions of $l$ into $K$ parts, and each of these corresponds to a
choice of $w$ and $I$. There are fewer than $a(d, K)$ choices for $u$, giving us a bound
of $a(d, K) \binom{l-1}{K-1}$ choices of pairs $w$ and $I$ for any fixed $l > L$.
Next, we will show that for any pair $w$ and $I$ with $|w| = l$,

$$\mathbf{P}[F(w, I)] \le (1 - e^{-T})^{l-K}. \tag{9}$$

Condition on $\overleftarrow{G}_s(0)$ having $n$ vertices. Consider any of the $[n]_l$ possible sequences of
$l$ vertices. Choose some representative $w' \in \mathcal{W}_l$ of $w$. For each of these sequences,
the probability that it forms a cycle with word $w'$ is at most $1/[n]_l$ (recall the
original definition of our random graphs in terms of random permutations). Given
that the sequence forms a cycle, the probability that the vertices of the cycle at
positions $I$ are all deleted within time $T$ is $(1 - e^{-T})^{l-K}$. Hence

$$\mathbf{P}\bigl[ F(w, I) \mid \overleftarrow{G}_s(0) \text{ has } n \text{ vertices} \bigr] \le [n]_l\, \frac{1}{[n]_l}\, (1 - e^{-T})^{l-K} = (1 - e^{-T})^{l-K}.$$

This holds for any $n$, establishing (9).
Applying all of this to (8),

$$\mathbf{P}[E_s(L)] \le \sum_{l > L} a(d, K) \binom{l-1}{K-1} (1 - e^{-T})^{l-K}.$$

This sum converges, which means that for any $\epsilon > 0$, we have $\mathbf{P}[E_s(L)] < \epsilon$ for
large enough $L$, independent of $s$.
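
To see how quickly the bound decays, one can evaluate the tail sum numerically. The sketch below is an illustration only: the function name, the truncation after a fixed number of terms, and the sample values of $K$ and $T$ are assumptions, and the factor $a(d, K)$ is passed in as a plain number.

```python
import math

def tail_bound(a_dK, K, T, L, terms=2000):
    """a_dK * sum over l > L of C(l-1, K-1) * (1 - e^{-T})^(l-K),
    truncated after `terms` terms."""
    q = 1.0 - math.exp(-T)
    return a_dK * sum(math.comb(l - 1, K - 1) * q ** (l - K)
                      for l in range(L + 1, L + 1 + terms))

# Hypothetical parameters: K = 3, T = 1, and the prefactor a(d, K) set to 1.
for L in (2, 20, 50, 100):
    print(L, tail_bound(1.0, 3, 1.0, L))
```

Summing from $l = K$ gives the negative binomial series $\sum_{l \ge K} \binom{l-1}{K-1} q^{l-K} = (1 - q)^{-K} = e^{KT}$, so the full series is finite and its tails must vanish as $L$ grows.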

Step 3. Approximation of $\overleftarrow{N}_w(t)$ by $\overleftarrow{M}_w(t)$.

Recall that we defined the processes $\{(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and
$\{(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ on the same probability space. We will show
that for sufficiently large $L$, the two processes are identical with probability
arbitrarily close to one.

By their definitions, these two processes are identical unless one of the processes
$X_{u,s}(\cdot)$ started at each atom of $\chi$ grows from a word of size $K$ or less to a word of
size $L + 1$ before time $T$; we call this event $E(L)$. Let

$$Y = \bigl| \{ (u, s) \in \chi : |u| \le K,\ s \le T \} \bigr|,$$

the number of processes starting from a word of size $K$ or less before time $T$.

Suppose that $X(\cdot)$ has law $P_w$ for some word $w \in \mathcal{W}_k/D_{2k}$. We can choose $L$
large enough that $\mathbf{P}[|X(T)| > L] < \epsilon$ for all $k \le K$. Then $\mathbf{P}[E(L) \mid Y] \le \epsilon Y$ by
a union bound, and so $\mathbf{P}[E(L)] \le \epsilon\, \mathbf{E} Y$. Since $\mathbf{E} Y < \infty$, we can make $\mathbf{P}[E(L)]$
arbitrarily small by choosing sufficiently large $L$.

Step 4. Weak convergence of $\{(\overleftarrow{C}_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ to
$\{(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$.

If two processes are identical with probability $1 - \epsilon$, then the total variation
distance between their laws is at most $\epsilon$. Thus, by steps 2 and 3, we can choose $L$
large enough that the laws of the processes $\{(\overleftarrow{C}_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and
$\{(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ are arbitrarily close in total variation distance,
uniformly in $s$, and so that the laws of $\{(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and
$\{(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ are arbitrarily close in total variation distance.
Since total variation distance dominates the Prokhorov metric (or any other metric
for the topology of weak convergence), we can choose $L$ such that these two pairs
are each within $\epsilon/3$ in the Prokhorov metric. Since $\{(\phi_w(\overleftarrow{\Gamma}_s(t)),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$
converges in law to $\{(\overleftarrow{M}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ as $s \to \infty$, there
is an $s_0$ such that for all $s > s_0$, the laws of these processes are within $\epsilon/3$ in
the Prokhorov metric. We have thus shown that for every $\epsilon > 0$, the laws of
$\{(\overleftarrow{C}_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ and $\{(\overleftarrow{N}_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ are within $\epsilon$
for sufficiently large $s$, which proves that the first process converges in law
to the second as $s \to \infty$.

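Step 5. Weak convergence of $\{(C_w^{(s)}(t),\ w \in \mathcal{W}'),\ t \ge 0\}$ to $\{(N_w(t),\ w \in \mathcal{W}'),\ t \ge 0\}$.

It follows immediately from the previous step that $\{(C_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$
converges in law to $\{(N_w(t),\ w \in \mathcal{W}'_K),\ 0 \le t \le T\}$ for any $T > 0$.
By Theorem 16.17 in [Bil99], $\{(C_w^{(s)}(t),\ w \in \mathcal{W}'_K),\ t \ge 0\}$ converges in law to
$\{(N_w(t),\ w \in \mathcal{W}'_K),\ t \ge 0\}$, which also proves that $\{(C_w^{(s)}(t),\ w \in \mathcal{W}'),\ t \ge 0\}$
converges in law to $\{(N_w(t),\ w \in \mathcal{W}'),\ t \ge 0\}$. □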

Proof of Theorem 1. We now consider the case of short cycles in the graph. We
will express these as functionals of $(C_w^{(s)}(t),\ w \in \mathcal{W}')$. For example, consider the
count of cycles of size $k \in \mathbb{N}$. Then $C_k^{(s)}(t) = \sum_{w \in \mathcal{W}_k/D_{2k}} C_w^{(s)}(t)$ is the number of
$k$-cycles in $G(s + t)$, and let

$$N_k(t) = \sum_{w \in \mathcal{W}_k/D_{2k}} N_w(t).$$

It follows immediately from the continuous mapping theorem that $\{(C_k^{(s)}(t),\ k \in \mathbb{N}),\ t \ge 0\}$
converges in law to $\{(N_k(t),\ k \in \mathbb{N}),\ t \ge 0\}$ as $s \to \infty$.

It is not hard to see that this limit is Markov and admits the following representation:
Cycles of size $k$ appear spontaneously with rate $\sum_{w \in \mathcal{W}_k/D_{2k}} \mu(w)$.
The size of each cycle then grows as a pure birth process with generator
$Lf(i) = i\,(f(i+1) - f(i))$. The only thing we need to verify is that

$$\sum_{w \in \mathcal{W}_k/D_{2k}} \mu(w) = \sum_{w \in \mathcal{W}_k/D_{2k}} \frac{k - c(w)}{h(w)} = \frac{a(d, k) - a(d, k-1)}{2}. \tag{10}$$

However, this follows from Lemma 10 in the following way. From that lemma
we get

$$\sum_{w \in \mathcal{W}_k/D_{2k}} \frac{c(w)}{h(w)} = \sum_{w \in \mathcal{W}_{k-1}/D_{2(k-1)}} \frac{k-1}{h(w)}.$$

Thus

$$\sum_{w \in \mathcal{W}_k/D_{2k}} \frac{k - c(w)}{h(w)} = \sum_{w \in \mathcal{W}_k/D_{2k}} \frac{k}{h(w)} - \sum_{w \in \mathcal{W}_{k-1}/D_{2(k-1)}} \frac{k-1}{h(w)}.$$

However, the two terms on the right side of the above equation are simply half the
total number of cyclically reduced words possible, of size $k$ and $k - 1$ respectively.
The total number of cyclically reduced words of size $k$ on an alphabet of size $d$ is
$a(d, k)$ (see the Appendix of [DJPP11]). This shows (10) and completes the proof. □

We end with the following corollary.

Corollary 14. For any $s < t$ and $j, k \in \mathbb{N}$, one has:

$$\mathrm{Cov}(N_j(s), N_k(t)) = \begin{cases} \dfrac{a(d, j)}{2j} \dbinom{k-1}{j-1} p^j (1 - p)^{k-j}, \quad p = e^{-(t-s)}, & \text{if } j \le k, \\ 0, & \text{otherwise.} \end{cases}$$

Proof. We will refer to the Yule processes counted by $N_k(t)$ as cycles of length
$k$ present at time $t$, even though these "cycles" in the limiting process have no
connection to graphs. If $k < j$, every cycle that is of length $j$ at time $s$ cannot
grow to a cycle of length $k$ at time $t$. Thus, $N_k(t)$ depends on cycles that are
independent of those that make up $N_j(s)$. Hence $N_k(t)$ is independent of $N_j(s)$.
If $k \ge j$, notice that one has the following decomposition:

$$N_k(t) = \sum_{j=1}^{k} \alpha(j, k) N_j(s) + Z, \tag{11}$$

where $\alpha(j, k)$ is the proportion of one-dimensional pure-birth Yule processes that
were at state $j$ at time $s$ and grew to state $k$ at time $t$, and $Z$ is a random variable
that counts the number of new births in the time interval $(s, t)$ that grew to
state $k$ at time $t$. Note that, under our invariant distribution all random variables
$\{Z, N_j(s),\ 1 \le j \le k\}$ are independent of one another. Thus, our conclusion follows
once we show

$$\mathbf{E}\, \alpha(j, k) = \binom{k-1}{j-1} p^j (1 - p)^{k-j}, \qquad p = e^{-(t-s)}. \tag{12}$$

The expected proportion $\mathbf{E}\, \alpha(j, k)$ is the probability that a one-dimensional process
$X_{j,k}$, with the law of a Yule process starting at $j$, is at state $k$ at time $(t - s)$. If
$\xi_j, \ldots, \xi_k$ are independent exponential random variables with rates $j, \ldots, k$, then

$$\mathbf{E}\, \alpha(j, k) = \mathbf{P}\bigl[ \{\xi_j + \cdots + \xi_{k-1} \le t - s\} \cap \{\xi_j + \cdots + \xi_k > t - s\} \bigr].$$

We now use the Rényi representation: suppose $Y_1, Y_2, \ldots, Y_k$ are iid Exp(1) random
variables. Define the order statistics $Y_{(1)} \ge Y_{(2)} \ge \cdots \ge Y_{(k)}$. Then, the following
equality holds in distribution:

$$(Y_{(i)} - Y_{(i+1)},\ j \le i \le k) = (\xi_i,\ j \le i \le k).$$

Here we have defined $Y_{(k+1)} = 0$. Thus, in distribution,

$$\xi_j + \cdots + \xi_{k-1} = Y_{(j)} - Y_{(k)}, \qquad \xi_j + \cdots + \xi_k = Y_{(j)}.$$

Thus

$$\mathbf{E}\, \alpha(j, k) = \mathbf{P}\bigl( t - s < Y_{(j)} \le Y_{(k)} + t - s \bigr).$$

Note that, by an elementary symmetry argument, for any $u > (t - s)$, we have

$$\begin{aligned}
\mathbf{P}\bigl[ Y_{(j)} \in (u, u + du),\ Y_{(j)} - Y_{(k)} \le t - s \bigr]
&= \mathbf{P}\bigl[ Y_i = u \text{ for some } i, \text{ exactly } j - 1 \text{ of } Y_1, \ldots, Y_k \text{ are greater than } u, \\
&\qquad \text{and the rest of } Y_1, \ldots, Y_k \text{ are in } [u - t + s,\ u] \bigr]\, du \\
&= k e^{-u} \binom{k-1}{j-1} e^{-(j-1)u} \bigl[ e^{-u + t - s} - e^{-u} \bigr]^{k-j}\, du \\
&= k \binom{k-1}{j-1} e^{-ku} \bigl( e^{t-s} - 1 \bigr)^{k-j}\, du.
\end{aligned}$$

Integrating out $u$ in the interval $(t - s, \infty)$ we get

$$\mathbf{P}\bigl[ t - s < Y_{(j)} \le Y_{(k)} + t - s \bigr] = \binom{k-1}{j-1} \bigl( e^{t-s} - 1 \bigr)^{k-j} \int_{t-s}^{\infty} k e^{-ku}\, du = \binom{k-1}{j-1} \bigl( e^{t-s} - 1 \bigr)^{k-j} e^{-k(t-s)},$$

which equals $\binom{k-1}{j-1} p^j (1-p)^{k-j}$ with $p = e^{-(t-s)}$.
This shows (12) and completes the proof of the corollary. □
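
The transition probability in (12) is a negative binomial law, which makes two simple numerical checks available: the probabilities over $k \ge j$ should sum to one, and their mean should equal $j e^{t-s}$, the expected size of a Yule process started at $j$. The sketch below performs both checks; the function name, the sample parameters and the truncation point are assumptions made for illustration.

```python
import math

def yule_transition(j, k, tau):
    """P[X(tau) = k | X(0) = j] for a Yule process in which each individual
    splits at rate 1, using the closed form C(k-1, j-1) p^j (1-p)^(k-j)."""
    if k < j:
        return 0.0
    p = math.exp(-tau)
    return math.comb(k - 1, j - 1) * p ** j * (1.0 - p) ** (k - j)

# Hypothetical parameters: start at j = 3 and run for time tau = 0.7.
j, tau = 3, 0.7
probabilities = {k: yule_transition(j, k, tau) for k in range(j, 2000)}
total_mass = sum(probabilities.values())
mean_size = sum(k * prob for k, prob in probabilities.items())
print(total_mass, mean_size, j * math.exp(tau))
```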

4.1. Two-dimensional convergence. So far, we have considered $d$ as a constant.
We now view it as a parameter of the graph and allow it to vary. Recall that
$(\Pi_d,\ d \in \mathbb{N})$ are independent towers of random permutations, with $\Pi_d = (\pi_d^{(n)},\ n \in \mathbb{N})$,
and that $G(n, 2d)$ is defined from $\pi_1^{(n)}, \ldots, \pi_d^{(n)}$. For each $d$, we follow the
construction used to define $G(t)$ and construct $G_{2d}(t)$, a continuous-time version of
$(G(n, 2d),\ n \in \mathbb{N})$. Let $\mathcal{W}'(d)$ be the set of equivalence classes of cyclically reduced
words as before, with the parameter $d$ made explicit. Define $C_{d,k}^{(s)}(t)$ as the number
of $k$-cycles in $G_{2d}(s + t)$ and consider the convergence of the two-dimensional field
$\{(C_{d,k}^{(s)}(t),\ d, k \in \mathbb{N}),\ t \ge 0\}$ as $s \to \infty$.

Again, we will consider this process as a functional of another one. Define
$\mathcal{W}'(\infty) = \bigcup_{d=1}^{\infty} \mathcal{W}'(d)$, noting that $\mathcal{W}'(1) \subset \mathcal{W}'(2) \subset \cdots$. For any $w \in \mathcal{W}'(d)$, the
number of cycles in $G_{2d'}(s + t)$ with word $w$ is the same for all $d' \ge d$. We define
$C_w^{(s)}(t)$ by this, so that

$$C_{d,k}^{(s)}(t) = \sum_{\substack{w \in \mathcal{W}'(d) \\ |w| = k}} C_w^{(s)}(t).$$

Then we will prove convergence of $\{(C_w^{(s)}(t),\ w \in \mathcal{W}'(\infty)),\ t \ge 0\}$ as $s \to \infty$.

To define a limit for this process, we extend $\mu$ to a measure on all of $\mathcal{W}'(\infty)$ and
define the Poisson point process $\chi$ on $\mathcal{W}'(\infty) \times [0, \infty)$. The rest of the construction is
identical to the one in Section 3.2, giving us random variables $(N_w(t),\ w \in \mathcal{W}'(\infty))$.

Theorem 15. The process $(C_w^{(s)}(\cdot),\ w \in \mathcal{W}'(\infty))$ converges in law as $s \to \infty$ to
$(N_w(\cdot),\ w \in \mathcal{W}'(\infty))$.

Proof. It suffices to prove that $(C_w^{(s)}(\cdot),\ w \in \mathcal{W}'(d))$ converges in law as $s \to \infty$ to
$(N_w(\cdot),\ w \in \mathcal{W}'(d))$ for each $d$, which we did in Theorem 13. □

Proof of Theorem 2. Let

$$N_{d,k}(t) = \sum_{\substack{w \in \mathcal{W}'(d) \\ |w| = k}} N_w(t).$$

By the continuous mapping theorem, $(N_{d,k}(\cdot),\ d, k \in \mathbb{N})$ is the limit of
$(C_{d,k}^{(s)}(\cdot),\ d, k \in \mathbb{N})$ as $s \to \infty$.

Let us now describe what the limiting process is. It is obvious that
$(N_{d,k}(\cdot),\ k \in \mathbb{N},\ d \in \mathbb{N})$ is jointly Markov. For every fixed $d$, the law of the corresponding
marginal is given by Theorem 1. To understand the relationship across $d$, notice
that cycles of size $k$ for $(d + 1)$ consist of cycles of size $k$ for $d$ and the extra ones
that contain an edge labeled by $\pi_{d+1}$ or $\pi_{d+1}^{-1}$. Thus

$$N_{d+1,k}(t) - N_{d,k}(t) = \sum_{\substack{w \in \mathcal{W}'(d+1) \setminus \mathcal{W}'(d) \\ |w| = k}} N_w(t).$$

This process is independent of $(N_{i,\cdot},\ i \in [d])$, since the sets of words involved are
disjoint. Moreover, the rates for this process are clearly the following: cycles of size
$k$ grow at rate $k$ and new cycles of size $k$ appear at rate
$[a(d+1, k) - a(d+1, k-1) - a(d, k) + a(d, k-1)]/2$. This completes the proof of the result. □

5. Process limit for linear eigenvalue statistics

Let us recall some of the basic facts established in [DJPP11, Section 3, 5] that
connect linear eigenvalue statistics with cycle counts. A closed non-backtracking
walk is a walk that begins and ends at the same vertex, and that never follows
an edge and immediately follows that same edge backwards. If the last step of a
closed non-backtracking walk is anything other than the reverse of the first step, we
say that the walk is cyclically non-backtracking. Cyclically non-backtracking walks
on $G_n$ are exactly the closed non-backtracking walks whose words are cyclically
reduced. Let $\mathrm{CNBW}_k^{(n)}$ denote the number of closed cyclically non-backtracking
walks of length $k$ on $G_n$.

Cyclically non-backtracking walks are useful because they can be computed as
linear functionals of a graph's eigenvalues. Let $\{T_n(x)\}_{n \in \mathbb{N}}$ be the Chebyshev
polynomials of the first kind on the interval $[-1, 1]$. We define a set of polynomials

$$\begin{aligned}
\Gamma_0(x) &= 1, \\
\Gamma_{2k}(x) &= 2 T_{2k}(x) + \frac{2d - 2}{(2d - 1)^k}, \qquad \forall k \ge 1, \\
\Gamma_{2k+1}(x) &= 2 T_{2k+1}(x), \qquad \forall k \ge 0.
\end{aligned}$$

Let $A_n$ be the adjacency matrix of $G_n$, and let $\lambda_1 \ge \cdots \ge \lambda_n$ be the eigenvalues
of $(2d - 1)^{-1/2} A_n / 2$. Then

$$\sum_{i=1}^{n} \Gamma_k(\lambda_i) = (2d - 1)^{-k/2}\, \mathrm{CNBW}_k^{(n)}. \tag{13}$$
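
Identity (13) can be checked by hand in one degenerate case. Assuming it extends to $d = 1$, take the graph to be a single $n$-cycle with $n \ge 3$: the rescaled eigenvalues are $\cos(2\pi m/n)$, the correction term in $\Gamma_{2k}$ vanishes, and the number of cyclically non-backtracking walks of length $k$ is $2n$ when $n$ divides $k$ and zero otherwise. The sketch below evaluates the left side numerically; the function names and the choice of example are assumptions made for illustration.

```python
import math

def chebyshev_T(k, x):
    """Chebyshev polynomial of the first kind via T_{m+1} = 2 x T_m - T_{m-1}."""
    t_prev, t_cur = 1.0, x
    if k == 0:
        return t_prev
    for _ in range(k - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def gamma_poly(k, x, d):
    """Gamma_k(x) for a 2d-regular graph, following the definition above."""
    if k == 0:
        return 1.0
    value = 2.0 * chebyshev_T(k, x)
    if k % 2 == 0:
        value += (2 * d - 2) / (2 * d - 1) ** (k // 2)
    return value

def cycle_graph_lhs(n, k):
    """Left side of (13) for the n-cycle viewed as a 2-regular graph (d = 1)."""
    eigenvalues = [math.cos(2.0 * math.pi * m / n) for m in range(n)]
    return sum(gamma_poly(k, lam, 1) for lam in eigenvalues)

# For the 5-cycle, the walk count is 10 when 5 divides k and 0 otherwise.
for k in (5, 7, 10):
    print(k, cycle_graph_lhs(5, k))
```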

Now, for any cycle in $G_n$ of length $j \mid k$, we obtain $2j$ non-backtracking walks of
length $k$ by choosing a starting point and direction and then walking around the
cycle repeatedly. In [DJPP11, Corollary 18], it is shown that with certain conditions
on the growth of $d$ and $r$, all cyclically non-backtracking walks of length $r$ or less
have this form with high probability. Thus the random vectors $(\mathrm{CNBW}_k^{(n)},\ 1 \le k \le r)$
and $\bigl( \sum_{j \mid k} 2j\, C_j^{(n)},\ 1 \le k \le r \bigr)$ have the same limiting distribution, and
the problem of finding the limiting distributions of polynomial linear eigenvalue
statistics is reduced to finding limiting distributions of cycle counts. We will prove
Theorem 3 by arguing that this holds for the entire process $(G(t),\ t \ge 0)$.

Call a cyclically non-backtracking walk bad if it is anything other than a repeated
walk around a cycle.

Proposition 16. Fix an integer $K$. There is a random time $T$, almost surely
finite, such that there are no bad cyclically non-backtracking walks of length $K$ or
less in $G(t)$ for all $t \ge T$.

Proof. We will work with the discrete-time version of our process $(G_n,\ n \in \mathbb{N})$.
We first define some machinery introduced in [LP10]. Consider some cyclically
non-backtracking walk of length $k$ on the edge-labeled complete graph $K_n$ of the
form

$$s_0 \xrightarrow{w_1} s_1 \xrightarrow{w_2} s_2 \xrightarrow{w_3} \cdots \xrightarrow{w_k} s_k = s_0.$$

Here, $s_i \in [n]$ and $w_i$ is $\pi_j$ or $\pi_j^{-1}$ for some $j$, indicating which permutation provided
the edge for the walk. We call $w = w_1 \cdots w_k$ the word of the walk, following the notation for
cycles. Note that the word of a cyclically non-backtracking walk is cyclically
reduced. We say that $G_n$ contains the walk if the random permutations $\pi_1, \ldots, \pi_d$
satisfy $w_i(s_{i-1}) = s_i$. In other words, $G_n$ contains a walk if, considering both as
edge-labeled directed graphs, the walk is a subgraph of $G_n$.

If $(s'_i,\ 0 \le i \le k)$ is another walk with the same word, we say that the two walks
are of the same category if $s_i = s_j \iff s'_i = s'_j$. In other words, two walks are of
the same category if they are identical up to relabeling vertices. The probability
that $G_n$ contains a walk depends only on its category. If a walk contains $e$ distinct
edges, then $G_n$ contains the walk with probability at most $1/[n]_e$.

Let $X_k^{(n)}$ be the number of bad walks of length $k$ in $G_n$ that start at vertex $n$.
We will first prove that with probability one, $X_k^{(n)} > 0$ for only finitely many $n$.
Call a category bad if the walks in the category are bad. Let $T_{k,d}$ be the number of
bad categories of walks of length $k$. For any particular bad category whose walks
contain $v$ distinct vertices, there are $[n-1]_{v-1}$ walks of that category whose first
vertex is $n$. Any bad walk contains more edges than vertices, so

$$\mathbf{E} X_k^{(n)} \le \frac{T_{k,d}\, [n-1]_{v-1}}{[n]_{v+1}} \le \frac{T_{k,d}}{n(n-k)}.$$

Since $X_k^{(n)}$ takes values in the nonnegative integers, $\mathbf{P}[X_k^{(n)} > 0] \le \mathbf{E} X_k^{(n)}$. By the
Borel-Cantelli lemma, $X_k^{(n)} > 0$ for only finitely many values of $n$.

Thus, for any fixed $r$, there exists a random time $N$ such that there are no
bad walks on $G_n$ of length $r + 1$ or less starting with vertex $n$, for $n \ge N$. We
claim that for $n \ge N$, there are no bad walks at all on $G_n$ with length $r$ or less.
Suppose that $G_m$ contains some bad walk of length $k \le r$, for some $m \ge N$. As
the graph evolves, it is easy to compute that with probability one, a new vertex is
eventually inserted into an edge of this walk. But at the time $n > m \ge N$ when
this occurs, $G_n$ will contain a bad walk of length $r + 1$ or less starting with vertex
$n$, a contradiction. Thus we have proven that $G_n$ eventually contains no bad walks
of length $r$ or less. The equivalent statement for the continuous-time version of the
graph process follows easily from this. □

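Proof of Theorem 3. Let $\mathrm{CNBW}_k^{(s)}(t)$ denote the number of cyclically non-backtracking
walks of length $k$ in $G(s + t)$. We decompose these into those that are repeated
walks around cycles of length $j$ for some $j$ dividing $k$, and the remaining bad walks,
which we denote $B_k^{(s)}(t)$, giving us

$$\mathrm{CNBW}_k^{(s)}(t) = \sum_{j \mid k} 2j\, C_j^{(s)}(t) + B_k^{(s)}(t).$$

Proposition 16 implies that

$$\lim_{s \to \infty} \mathbf{P}\bigl[ B_k^{(s)}(t) = 0 \text{ for all } k \le K,\ t \ge 0 \bigr] = 1.$$

By Theorem 1 together with the continuous mapping theorem and Slutsky's theorem,
as $s$ tends to infinity,

$$\bigl( \mathrm{CNBW}_k^{(s)}(\cdot),\ 1 \le k \le K \bigr) \xrightarrow{\ \mathcal{L}\ } \Bigl( \sum_{j \mid k} 2j\, N_j(\cdot),\ 1 \le k \le K \Bigr). \tag{14}$$

Now, we modify the polynomials $\Gamma_k$ to form a new basis $\{f_k,\ k \in \mathbb{N}\}$ with the
right properties, which amounts to expressing each $N_k(t)$ as a linear combination
of terms $\sum_{j \mid i} 2j\, N_j(t)$. We do this with the Möbius inversion formula. Define the
polynomial

$$f_k(x) = \frac{1}{2k} \sum_{j \mid k} \mu\Bigl( \frac{k}{j} \Bigr) (2d - 1)^{j/2}\, \Gamma_j(x),$$

where $\mu$ is the Möbius function, given by

$$\mu(n) = \begin{cases} (-1)^a & \text{if } n \text{ is the product of } a \text{ distinct primes,} \\ 0 & \text{otherwise.} \end{cases}$$

The theorem then follows from (13), (14), and the continuous mapping theorem. □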
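
The Möbius inversion step can be made concrete with integer counts. Writing $W_k = \sum_{j \mid k} 2j\, N_j$ for the number of repeated walks of length $k$, inversion gives $2k N_k = \sum_{j \mid k} \mu(k/j) W_j$. The sketch below checks this round trip; the function names and the sample cycle counts are hypothetical.

```python
def mobius(n):
    """Möbius function: (-1)^a if n is a product of a distinct primes, else 0."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one prime factor is left over
        result = -result
    return result

def divisors(k):
    return [j for j in range(1, k + 1) if k % j == 0]

def walks_from_cycles(cycle_counts, k):
    """W_k = sum over j dividing k of 2 j N_j: repeated walks around cycles."""
    return sum(2 * j * cycle_counts[j] for j in divisors(k))

def cycles_from_walks(walk_counts, k):
    """Möbius inversion: N_k = (1 / 2k) * sum over j dividing k of mu(k / j) W_j."""
    total = sum(mobius(k // j) * walk_counts[j] for j in divisors(k))
    return total // (2 * k)

# Hypothetical cycle counts N_1, ..., N_6.
cycle_counts = {1: 3, 2: 0, 3: 5, 4: 1, 5: 2, 6: 7}
walk_counts = {k: walks_from_cycles(cycle_counts, k) for k in cycle_counts}
recovered = {k: cycles_from_walks(walk_counts, k) for k in cycle_counts}
print(recovered == cycle_counts)
```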



Proof of Theorem 4. We start by recalling that, for any fixed $d$,

$$2 \operatorname{tr} T_i(G(\infty + t)) + n\, \frac{2d - 2}{(2d - 1)^{i/2}}\, \mathbf{1}\{ i \text{ is even} \} = (2d - 1)^{-i/2} \sum_{k \mid i} 2k\, N_k(t).$$

To prove the Gaussian convergence consider two time points $s \le t$ and two
positive integers $i, k$. We will first show that, for any $i, k \in \mathbb{N}$, the pair
$\bigl( (2d-1)^{-i/2} (N_i(s) - \mathbf{E} N_i(s)),\ (2d-1)^{-k/2} (N_k(t) - \mathbf{E} N_k(t)) \bigr)$ converges to Gaussian in
distribution as $d$ tends to infinity. When $s = t$, this trivially follows via the Central
Limit Theorem and their independent Poisson joint distribution.

When $s < t$, observe from (11) that

$$N_k(t) = \sum_{j=1}^{k} \alpha(j, k) N_j(s) + Z.$$

Here $\alpha(j, k) N_j(s)$, $j \in [k]$, and $Z$ are independent Poisson random variables of
various means. Moreover $Z$ is independent of the history of the process till time
$s$. Under the stationary law, the coordinates of the vector $(N_j(s),\ j \in \mathbb{N})$ are independent Poisson
random variables. Thus, if $i > k$, then $N_i(s)$ is independent of $N_k(t)$. Otherwise, by
the thinning property of Poisson, $\alpha(i, k) N_i(s)$ is independent of $(1 - \alpha(i, k)) N_i(s)$.
Therefore, $N_k(t) - \alpha(i, k) N_i(s)$, $\alpha(i, k) N_i(s)$, and $(1 - \alpha(i, k)) N_i(s)$ are three
independent Poisson random variables.
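
The thinning property used here can be verified exactly at the level of probability mass functions: if a Poisson number of points is split by independent coin flips, the joint law of the two parts is a product of Poisson laws. The sketch below checks this on a small grid; the function names and the parameters are assumptions made for illustration.

```python
import math

def poisson_pmf(lam, n):
    """P[X = n] for X ~ Poi(lam)."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def thinned_joint_pmf(lam, p, kept, dropped):
    """P[kept points = kept, dropped points = dropped] when a Poi(lam) number
    of points are each kept independently with probability p."""
    total = kept + dropped
    binomial = math.comb(total, kept) * p ** kept * (1.0 - p) ** dropped
    return poisson_pmf(lam, total) * binomial

# Hypothetical parameters: mean 4, keep probability 0.3.
lam, p = 4.0, 0.3
max_gap = max(
    abs(thinned_joint_pmf(lam, p, a, b)
        - poisson_pmf(lam * p, a) * poisson_pmf(lam * (1.0 - p), b))
    for a in range(8) for b in range(8)
)
print(max_gap)
```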

By the normal approximation to Poisson, we get the appropriate distributional
convergence to corresponding independent Gaussian random variables. This shows
the joint convergence of $(N_i(s), N_k(t))$ to Gaussian after centering and scaling.

A similar Gram-Schmidt orthogonalization can be carried out for the case of time
points $t_1 \le t_2 \le \cdots \le t_m$ and corresponding positive integers $j_1, j_2, \ldots, j_m$. This
proves the joint Gaussian convergence of any finite collection $(N_{j_i}(t_i),\ i \in [m])$
under centering and suitable scaling. Since the traces of Chebyshev polynomials are
linear combinations of coordinates of $N$, the joint Gaussian convergence extends to
them by an argument invoking the Continuous Mapping and Slutsky's theorems.

For a fixed $d$, the covariance computation follows from Corollary 14 and the trace
identity recalled at the start of this proof. Hence, if $s < t$, then

$$\begin{aligned}
\mathrm{Cov}\bigl( \operatorname{tr} T_i(G(\infty + t)),\ \operatorname{tr} T_j(G(\infty + s)) \bigr)
&= \frac{1}{4} (2d - 1)^{-(i+j)/2} \sum_{k \mid i,\ l \mid j} 4kl\, \mathrm{Cov}(N_k(t), N_l(s)) \\
&= (2d - 1)^{-(i+j)/2} \sum_{\substack{k \mid i,\ l \mid j \\ l \le k}} \frac{k}{2}\, a(d, l) \binom{k-1}{l-1} p^l (1 - p)^{k-l}.
\end{aligned} \tag{15}$$

Here

$$\mathrm{Cov}(N_k(t), N_l(s)) = \begin{cases} \dfrac{a(d, l)}{2l} \dbinom{k-1}{l-1} p^l (1 - p)^{k-l}, \quad p = e^{-(t-s)}, & \text{if } k \ge l, \\ 0, & \text{otherwise.} \end{cases}$$

We now fix any $i, j, t, s$ and take $d$ to infinity. Any term $a(d, r)$ is asymptotically
the same as $(2d - 1)^r$. Thus the highest order term (in $d$) on the right side of (15)
is $(2d - 1)^{\min(i, j)}$. Unless $i = j$, this term is negligible compared to $(2d - 1)^{(i+j)/2}$.
This shows that the limiting covariance is zero unless $i = j$.

On the other hand, when $i = j$, every term on the right side of (15) vanishes in the limit,
except when $k = i = l = j$. Hence,

$$\lim_{d \to \infty} \mathrm{Cov}\bigl( \operatorname{tr} T_i(G(\infty + t)),\ \operatorname{tr} T_i(G(\infty + s)) \bigr) = \frac{1}{4}\, 2i\, p^i = \frac{i}{2}\, e^{i(s-t)}.$$
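
The limit can be observed numerically by evaluating the right side of (15) for a large value of $d$. The sketch below assumes the covariance formula of Corollary 14 in the form $\frac{a(d,l)}{2l}\binom{k-1}{l-1}p^l(1-p)^{k-l}$, and it assumes the closed form $a(d, k) = (2d-1)^k + 2d - 1$ for even $k$ and $(2d-1)^k + 1$ for odd $k$, which is not stated in this section; the function names are likewise illustrative.

```python
import math

def a(d, k):
    """Number of cyclically reduced words of length k on d letters and their
    inverses (closed form assumed here, not taken from this section)."""
    return (2 * d - 1) ** k + (2 * d - 1 if k % 2 == 0 else 1)

def cov_cycles(d, k, l, tau):
    """Cov(N_k(t), N_l(s)) with tau = t - s >= 0, in the form given above."""
    if l > k:
        return 0.0
    p = math.exp(-tau)
    return a(d, l) / (2 * l) * math.comb(k - 1, l - 1) * p ** l * (1.0 - p) ** (k - l)

def cov_traces(d, i, j, tau):
    """Right side of (15): covariance of tr T_i at time t and tr T_j at time s."""
    total = sum(4 * k * l * cov_cycles(d, k, l, tau)
                for k in range(1, i + 1) if i % k == 0
                for l in range(1, j + 1) if j % l == 0)
    return 0.25 * (2 * d - 1) ** (-(i + j) / 2) * total

tau = 0.5
print(cov_traces(10 ** 6, 2, 2, tau), math.exp(-2 * tau))  # diagonal case, limit (i/2) e^{-i tau}
print(cov_traces(10 ** 6, 1, 2, tau))                      # off-diagonal case, limit 0
```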

Finally we prove the process convergence. The Gaussian convergence shown 
above already shows finite dimensional convergence to the stated Ornstein-Uhlenbeck 
process. One simply needs to argue tightness. 

Fix a. K E N and, for every d, consider the process 

(Xfe(f) := (2d - l)-^/2 {2kNkit) - aid, fc)) , fc G [if], i > o) . 

We claim that it suffices to show tightness for this process. This follows, since then, 
due to unequal scaling, the difference between this process and the centered and 
scaled traces go to zero in probability as d tends to infinity. 

To show tightness of X, note that, by [EK861 Chapter 11, Problem 22 (c)], it 
suffices to show tightness for each of the individual one-dimensional processes 

{ATfe, /ceN} and {Xk + Xi, k,l eN} . 

For each of these one-dimensional processes, it is enough to show tightness in 
D[0,T] (take T = 1, without loss of generality) which we will show using |Bil99[ 
Theorem 13.5] for /3 = l,a = 1. 
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Consider Xk ■ We already know that it has finite-dimensional convergence to the 
Ornstein-Uhlenbeck process. Now, for any r < s < i, we want to estimate 



E 



\2 /T^ /.N T^ / Nn2 



16A: 



4 



iXkis) - Xk{r)r {Xk{t) - Xu{s)Y = ^^^^YpE {Nk{s) - Nk{r)Y {Nk{t) - Nk{s)y 



\2 /AT rj.\ AT / \\2 



We claim that the above is 0{t — r)^ as d tends to infinity. 

To see this, first use the elementary inequality 2ah < a? + h^ to obtain 

E [(Xk{s) - Xk{r)f (Xfe(i) - Xk(s)f] < i [e [Xkis] - Xk{r)t + E (Xfe(i) - Xu{s)f 

Consider E {Xk{t) — Xk{s)) , which can be written as 

i2d-l)-^''{2kf-EiNkit)-Nkis))\ 

From the orthogonal decomposition described in the beginning of this proof, one can write $N_k(t) - N_k(s)$ as the difference of two independent Poisson random variables $Z_1 - Z_2$ with the same mean $\nu = (1 - r(k,k))\,a(d,k)/2k$, where $r(k,k) = e^{k(s-t)}$. Thus

\begin{align*}
\mathbf{E}(N_k(t) - N_k(s))^4 &= \mathbf{E}\bigl[(Z_1 - \nu) - (Z_2 - \nu)\bigr]^4 \\
&= \mathbf{E}(Z_1-\nu)^4 - 4\,\mathbf{E}(Z_1-\nu)^3\,\mathbf{E}(Z_2-\nu) + 6\,\mathbf{E}(Z_1-\nu)^2\,\mathbf{E}(Z_2-\nu)^2 \\
&\qquad - 4\,\mathbf{E}(Z_1-\nu)\,\mathbf{E}(Z_2-\nu)^3 + \mathbf{E}(Z_2-\nu)^4 \\
&= 2\nu(1+3\nu) + 6\nu^2 = 2\nu(1+6\nu).
\end{align*}
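The fourth-moment identity above can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the Poisson sums are truncated at an assumed cutoff `kmax`, which is harmless for the small means used here.

```python
from math import exp

def poisson_pmf(nu, kmax):
    """Return [P(Z = 0), ..., P(Z = kmax)] for Z ~ Poisson(nu)."""
    pmf = [exp(-nu)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * nu / k)
    return pmf

def fourth_moment_of_difference(nu, kmax=80):
    """Numerically compute E[(Z1 - Z2)^4] for independent Z1, Z2 ~ Poisson(nu).

    The double sum is truncated at kmax (an assumption of this sketch).
    """
    pmf = poisson_pmf(nu, kmax)
    return sum(pmf[a] * pmf[b] * (a - b) ** 4
               for a in range(kmax + 1)
               for b in range(kmax + 1))

# Compare the numerical value with the closed form 2*nu*(1 + 6*nu).
for nu in (0.3, 1.0, 2.5):
    print(nu, fourth_moment_of_difference(nu), 2 * nu * (1 + 6 * nu))
```

The two printed columns should agree up to floating-point error, which reflects that the odd central moments cancel and only the terms $\nu(1+3\nu)$ and $\nu^2$ survive.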

Since $\nu = (1 - e^{k(s-t)})\,a(d,k)/2k$, we get

\[ (2d-1)^{-2k}(2k)^4\,\mathbf{E}(N_k(t)-N_k(s))^4 \le C'_k\,(1 - e^{k(s-t)})^2 \le C_k\,(t-s)^2. \]

Here $C_k$, $C'_k$ are constants depending on $k$.

Similarly $\mathbf{E}(X_k(s)-X_k(r))^4 \le C_k (s-r)^2$. Combining the two parts we get $\mathbf{E}\bigl[(X_k(s)-X_k(r))^2 (X_k(t)-X_k(s))^2\bigr] \le C_k (t-r)^2$. This allows us to use [Bil99, Theorem 13.5, eqn. (13.14)] to show tightness of $X_k$.

A similar argument for the pairs $X_k + X_l$ proves our claim and completes the proof of the theorem. $\square$

Appendix: A broad Poisson approximation result 
The following facts were used in the proof of Theorem [T^l 

Theorem 17. Let $(Z_w,\ w \in \mathcal{W}'_K)$ be a family of independent Poisson random variables with $\mathbf{E} Z_w = 1/h(w)$. For any fixed integer $K$,

(i) as $t \to \infty$, the vector of cycle counts $(C_w(t),\ w \in \mathcal{W}'_K)$ converges in law to $(Z_w,\ w \in \mathcal{W}'_K)$;

(ii) as $t \to \infty$, the probability that there exist two cycles of length $K$ or less sharing a vertex in $G(t)$ approaches zero.

We defer the proof to the end of the section. We will in fact establish a far more general theorem.
Theorem 18. Let $G_n = G(n, 2d)$, a $2d$-regular random graph on $n$ vertices from the permutation model. For any $k$, let $\mathcal{I}_k$ be the set of all cycles of length $k$ on the complete graph $K_n$ with edge labels that form a cyclically reduced word; these are the possible $k$-cycles that might appear in $G_n$. Let $\mathcal{I} = \bigcup_{k=1}^r \mathcal{I}_k$ for some integer $r$. For any cycle $\alpha \in \mathcal{I}$, let $I_\alpha = 1\{G_n \text{ contains } \alpha\}$, and let $\mathbf{I} = (I_\alpha,\ \alpha \in \mathcal{I})$. Let $\mathbf{Z} = (Z_\alpha,\ \alpha \in \mathcal{I})$ be a vector whose coordinates are independent Poisson random variables with $\mathbf{E} Z_\alpha = 1/[n]_k$ for $\alpha \in \mathcal{I}_k$. Then for all $d \ge 2$ and $n, r \ge 1$,

\[ d_{TV}(\mathbf{I}, \mathbf{Z}) \le \frac{c\,(2d-1)^{2r-1}}{n} \]

for some absolute constant $c$, where $d_{TV}(X, Y)$ denotes the total variation distance between the laws of $X$ and $Y$.

Theorem 11 in [DJPP11] gave a similar Poisson approximation for the vector of cycle counts. As the vector of cycle counts is a functional of $\mathbf{I}$, Theorem 18 immediately implies this theorem; we give the details in Corollary 23(i). Our theorem also improves the total variation bound from $O((2d-1)^{2r}/n)$ to $O((2d-1)^{2r-1}/n)$. We conjecture that Theorem 18 is sharp.

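As in the proof of Theorem 11 in [DJPP11], the main tool is the Stein-Chen method for Poisson approximation by size-biased couplings as described in [BHJ92], which uses the following idea: For each $\alpha \in \mathcal{I}$, let $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ be distributed as $(I_\beta,\ \beta \in \mathcal{I})$ conditioned on $I_\alpha = 1$. The goal is to construct a coupling of $(I_\beta,\ \beta \in \mathcal{I})$ and $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ so that the two random vectors are "close together". We hope that for each $\alpha \in \mathcal{I}$, the cycles in $\mathcal{I} \setminus \{\alpha\}$ can be partitioned into two sets $\mathcal{I}_\alpha^-$ and $\mathcal{I}_\alpha^+$ such that

\[ J_{\beta\alpha} \le I_\beta \quad \text{if } \beta \in \mathcal{I}_\alpha^-, \tag{16} \]

\[ J_{\beta\alpha} \ge I_\beta \quad \text{if } \beta \in \mathcal{I}_\alpha^+. \tag{17} \]

If this is the case, then one can establish a Poisson approximation by calculating $\operatorname{Cov}(I_\alpha, I_\beta)$ for every $\alpha, \beta \in \mathcal{I}$, according to the following proposition.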

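Proposition 19 (Corollary 10.B.1 in [BHJ92]). Suppose that $\mathbf{I} = (I_\alpha,\ \alpha \in \mathcal{I})$ is a vector of 0-1 random variables with $\mathbf{E} I_\alpha = p_\alpha$. Suppose that $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ is distributed as described above, and that for each $\alpha$ there exists a partition and a coupling of $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ with $(I_\beta,\ \beta \in \mathcal{I})$ such that (16) and (17) are satisfied.

Let $\mathbf{Y} = (Y_\alpha,\ \alpha \in \mathcal{I})$ be a vector of independent Poisson random variables with $\mathbf{E} Y_\alpha = p_\alpha$. Then

\[ d_{TV}(\mathbf{I}, \mathbf{Y}) \le \sum_{\alpha \in \mathcal{I}} p_\alpha^2 + \sum_{\alpha \in \mathcal{I}} \sum_{\beta \in \mathcal{I}_\alpha^-} \bigl|\operatorname{Cov}(I_\alpha, I_\beta)\bigr| + \sum_{\alpha \in \mathcal{I}} \sum_{\beta \in \mathcal{I}_\alpha^+} \operatorname{Cov}(I_\alpha, I_\beta). \tag{18} \]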

We introduce two lemmas, whose proofs we will defer to the end of the appendix. 
The first will let us approximate I by Z rather than by Y, and the second provides 
a technical bound that we need. 

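Lemma 20. Let $\mathbf{Y} = (Y_\alpha,\ \alpha \in \mathcal{I})$ and $\mathbf{Z} = (Z_\alpha,\ \alpha \in \mathcal{I})$ be vectors of independent Poisson random variables. Then

\[ d_{TV}(\mathbf{Y}, \mathbf{Z}) \le \sum_{\alpha \in \mathcal{I}} \bigl|\mathbf{E} Y_\alpha - \mathbf{E} Z_\alpha\bigr|. \]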
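As an illustration of Lemma 20 in the simplest case of a single coordinate, the total variation distance between two Poisson laws can be computed numerically and compared with the difference of the means. This is only a sketch under stated assumptions: the helper names are hypothetical and the infinite sum is truncated at an assumed cutoff `kmax`.

```python
from math import exp

def poisson_pmf(lam, kmax):
    """Return [P(Z = 0), ..., P(Z = kmax)] for Z ~ Poisson(lam)."""
    pmf = [exp(-lam)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def tv_poisson(lam, mu, kmax=200):
    """Total variation distance between Poisson(lam) and Poisson(mu).

    Uses d_TV = (1/2) * sum_k |p_k - q_k|, truncated at kmax (an
    assumption of this sketch; fine for small means).
    """
    p = poisson_pmf(lam, kmax)
    q = poisson_pmf(mu, kmax)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

for lam, mu in [(1.0, 1.3), (0.2, 0.25), (3.0, 3.5)]:
    print(lam, mu, tv_poisson(lam, mu), abs(lam - mu))
```

In each row the computed distance should not exceed $|\lambda - \mu|$, matching the one-coordinate instance of the lemma.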
Lemma 21. Let $a$ and $b$ be $d$-dimensional vectors with nonnegative integer components, and let $\langle a, b \rangle$ denote the standard Euclidean inner product. Then

\[ \prod_{i=1}^d \frac{1}{[n]_{a_i+b_i}} - \prod_{i=1}^d \frac{1}{[n]_{a_i}[n]_{b_i}} \le \frac{\langle a, b\rangle}{n} \prod_{i=1}^d \frac{1}{[n]_{a_i+b_i}}. \]
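The inequality of Lemma 21 can be checked exactly on small cases with rational arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it assumes $a_i + b_i \le n$ so that every falling factorial is nonzero.

```python
from fractions import Fraction
from itertools import product

def falling(n, k):
    """Falling factorial [n]_k = n (n-1) ... (n-k+1), with [n]_0 = 1."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

def lemma21_sides(n, a, b):
    """Return (left side, right side) of the Lemma 21 inequality as Fractions."""
    joint = Fraction(1)   # product of 1/[n]_{a_i + b_i}
    split = Fraction(1)   # product of 1/([n]_{a_i} [n]_{b_i})
    for ai, bi in zip(a, b):
        joint /= falling(n, ai + bi)
        split /= falling(n, ai) * falling(n, bi)
    inner = sum(ai * bi for ai, bi in zip(a, b))
    return joint - split, Fraction(inner, n) * joint

def check_lemma21(n, d, max_entry):
    """Check the inequality for all a, b in {0, ..., max_entry}^d."""
    vectors = list(product(range(max_entry + 1), repeat=d))
    return all(lhs <= rhs
               for a in vectors
               for b in vectors
               for lhs, rhs in [lemma21_sides(n, a, b)])

print(check_lemma21(n=10, d=2, max_entry=3))
```

Exact fractions avoid any rounding question, since the two sides can be very close when $\langle a, b\rangle$ is small.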

Proof of Theorem 18. We will give the proof in three sections: First, we make the coupling and show that it satisfies (16) and (17). Next, we apply Proposition 19 to approximate $\mathbf{I}$ by $\mathbf{Y}$, a vector of independent Poissons with $\mathbf{E} Y_\alpha = \mathbf{E} I_\alpha$. Last, we approximate $\mathbf{Y}$ by $\mathbf{Z}$ to prove the theorem.

If $d > n^{1/2}$ or $r > n^{1/10}$, then $c(2d-1)^{2r-1}/n > 1$ for a sufficiently large choice of $c$, and the theorem holds trivially. Thus we will assume throughout that $d \le n^{1/2}$ and $r \le n^{1/10}$ (the choice of $1/10$ here is completely arbitrary). The expression $O(f(d, r, n))$ should be interpreted as a function of $d$, $r$, and $n$ whose absolute value is bounded by $C f(d, r, n)$ for some absolute constant $C$, for all $d$, $r$, and $n$ satisfying $2 \le d \le n^{1/2}$ and $r \le n^{1/10}$.

Step 1. Constructing the coupling.

Fix some $\alpha \in \mathcal{I}$. We will construct a random vector $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ distributed as $(I_\beta,\ \beta \in \mathcal{I})$ conditioned on $I_\alpha = 1$. We do this by constructing a random graph $G'_n$ distributed as $G_n$ conditioned to contain the cycle $\alpha$. Once this is done, we will define $J_{\beta\alpha} = 1\{G'_n \text{ contains cycle } \beta\}$.

Let $\pi_1, \ldots, \pi_d$ be the random permutations that give rise to $G_n$. We will alter them to form permutations $\pi'_1, \ldots, \pi'_d$, and we will construct $G'_n$ from these. Let us first consider what distributions $\pi'_1, \ldots, \pi'_d$ should have. For example, suppose that $\alpha$ is the cycle

\[ 1 \xrightarrow{\ \pi_3\ } 2 \xleftarrow{\ \pi_1\ } 3 \xrightarrow{\ \pi_3\ } 4 \xrightarrow{\ \pi_1\ } 1. \]

Then $\pi'_1$ should be distributed as a uniform random $n$-permutation conditioned to make $\pi'_1(3) = 2$ and $\pi'_1(4) = 1$, and $\pi'_3$ should be distributed as a uniform random $n$-permutation conditioned to make $\pi'_3(1) = 2$ and $\pi'_3(3) = 4$, while $\pi'_2$ should just be a uniform random $n$-permutation. A random graph constructed from $\pi'_1$, $\pi'_2$, and $\pi'_3$ will be distributed as $G_n$ conditioned to contain $\alpha$.

We now describe the construction of $\pi'_1, \ldots, \pi'_d$. Suppose $\alpha$ is the cycle

\[ s_0 \stackrel{w_1}{\text{---}} s_1 \stackrel{w_2}{\text{---}} s_2 \stackrel{w_3}{\text{---}} \cdots \stackrel{w_k}{\text{---}} s_k = s_0, \tag{19} \]

with each edge directed according to whether $w_i(s_{i-1}) = s_i$ or $w_i(s_i) = s_{i-1}$. Fix some $1 \le l \le d$, and suppose that the edge-label $\pi_l$ appears $M$ times in the cycle $\alpha$. Let $(a_m, b_m)$ for $1 \le m \le M$ be these directed edges. We must construct $\pi'_l$ to have the uniform distribution conditioned on $\pi'_l(a_m) = b_m$ for $1 \le m \le M$.

We define a sequence of random transpositions by the following algorithm: Let $\tau_1$ swap $\pi_l(a_1)$ with $b_1$. Let $\tau_2$ swap $\tau_1 \pi_l(a_2)$ with $b_2$, and so on. We then define $\pi'_l = \tau_M \cdots \tau_1 \pi_l$. This permutation satisfies $\pi'_l(a_m) = b_m$ for $1 \le m \le M$, and it is distributed uniformly, subject to the given constraints, which can be proven by induction on each swap. We now define $G'_n$ from the permutations $\pi'_1, \ldots, \pi'_d$ in the usual way. It is defined on the same probability space as $G_n$, and it is distributed as $G_n$ conditioned to contain $\alpha$, giving us a random vector $(J_{\beta\alpha},\ \beta \in \mathcal{I})$ coupled with $(I_\beta,\ \beta \in \mathcal{I})$.
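The transposition algorithm of Step 1 can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: permutations are represented as tuples on $\{0, \ldots, n-1\}$ (an assumption of the sketch), and `condition_permutation` and `output_distribution` are hypothetical names. Running it over all permutations of a small set lets one see the uniformity claim directly.

```python
from collections import Counter
from itertools import permutations

def condition_permutation(pi, constraints):
    """Return tau_M ... tau_1 pi, where tau_m swaps the current image of a_m with b_m.

    pi is a tuple with pi[x] the image of x; constraints is a list of
    pairs (a_m, b_m) with distinct a_m and distinct b_m.
    """
    values = list(pi)
    for a, b in constraints:
        c = values[a]
        if c != b:
            # Left-multiply by the transposition (c b): swap the values c and b.
            values = [b if v == c else c if v == b else v for v in values]
    return tuple(values)

def output_distribution(n, constraints):
    """Push the uniform law on all n-permutations through the coupling."""
    return Counter(condition_permutation(pi, constraints)
                   for pi in permutations(range(n)))

dist = output_distribution(4, [(0, 1), (2, 3)])
for perm, count in sorted(dist.items()):
    print(perm, count)
```

Every output permutation satisfies both constraints, and each one is hit by the same number of inputs, which is the uniformity property used in the proof.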
Now, we will give a partition $\mathcal{I}_\alpha^- \cup \mathcal{I}_\alpha^+ = \mathcal{I} \setminus \{\alpha\}$ satisfying (16) and (17). Suppose that $G_n$ contains an edge $s_i \xrightarrow{w_{i+1}} v$ with $v \ne s_{i+1}$, or an edge $v \xrightarrow{w_{i+1}} s_{i+1}$ with $v \ne s_i$. The graph $G'_n$ cannot contain this edge, since it contains $\alpha$. In fact, edges of this form are the only ones found in $G_n$ but not $G'_n$:

Lemma 22. Suppose there is an edge $i \xrightarrow{\pi_l} j$ contained in $G_n$ but not in $G'_n$. Then $\alpha$ contains either an edge $i \xrightarrow{\pi_l} v$ with $v \ne j$, or $\alpha$ contains an edge $v \xrightarrow{\pi_l} j$ with $v \ne i$.

Proof. Suppose $\pi_l(i) = j$, but $\pi'_l(i) \ne j$. Then $j$ must have been swapped when making $\pi'_l$, which can happen only if $\pi_l(a_m) = j$ or $b_m = j$ for some $m$. In the first case, $a_m = i$ and $\alpha$ contains the edge $i \xrightarrow{\pi_l} b_m$ with $b_m \ne j$, and in the second $\alpha$ contains the edge $a_m \xrightarrow{\pi_l} j$ with $a_m \ne i$. $\square$

Define $\mathcal{I}_\alpha^-$ as all cycles in $\mathcal{I}$ that contain an edge $s_i \xrightarrow{w_{i+1}} v$ with $v \ne s_{i+1}$ or an edge $v \xrightarrow{w_{i+1}} s_{i+1}$ with $v \ne s_i$, and define $\mathcal{I}_\alpha^+$ to be the rest of $\mathcal{I} \setminus \{\alpha\}$. Since $G'_n$ cannot contain any cycle in $\mathcal{I}_\alpha^-$, we have $J_{\beta\alpha} = 0$ for all $\beta \in \mathcal{I}_\alpha^-$, satisfying (16). For any $\beta \in \mathcal{I}_\alpha^+$, Lemma 22 shows that if $\beta$ appears in $G_n$, it must also appear in $G'_n$. Hence $J_{\beta\alpha} \ge I_\beta$, and (17) is satisfied.

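Step 2. Approximation of $\mathbf{I}$ by $\mathbf{Y}$.

The conditions of Proposition 19 are satisfied, and we need only bound the sums in (18). Let $p_\alpha = \mathbf{E} I_\alpha$, the probability that cycle $\alpha$ appears in $G_n$. Recall that this equals $\prod_{i=1}^d 1/[n]_{e_i}$, where $e_i$ is the number of times $\pi_i$ and $\pi_i^{-1}$ appear in the word of $\alpha$. This means that

\[ \frac{1}{n^k} \le p_\alpha \le \frac{1}{[n]_k}, \tag{20} \]

where $k = |\alpha|$, the length of cycle $\alpha$.

We bound the first sum in (18) by

\[ \sum_{\alpha \in \mathcal{I}} p_\alpha^2 = \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} p_\alpha^2 \le \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \frac{1}{[n]_k^2} = \sum_{k=1}^r \frac{[n]_k\, a(d,k)}{2k} \cdot \frac{1}{[n]_k^2} = \sum_{k=1}^r \frac{a(d,k)}{2k\,[n]_k} = O\Bigl(\frac{d}{n}\Bigr). \tag{21} \]

To bound the second sum in (18), we investigate the size of $\mathcal{I}_\alpha^-$. Suppose that $\alpha \in \mathcal{I}_k$, and $\alpha$ has the form given in (19). Any $\beta \in \mathcal{I}_\alpha^-$ must contain an edge $s_i \xrightarrow{w_{i+1}} v$ with $v \ne s_{i+1}$, or an edge $v \xrightarrow{w_{i+1}} s_{i+1}$ with $v \ne s_i$, and there are at most $2k(n-1)$ edges of this form. For any given edge, there are at most $[n-2]_{j-2}(2d-1)^{j-1}$ cycles in $\mathcal{I}_j$ that contain that edge, for any $j \ge 2$. Thus for any $\alpha \in \mathcal{I}_k$, the number of cycles of length $j \ge 2$ in $\mathcal{I}_\alpha^-$ is at most $2k[n-1]_{j-1}(2d-1)^{j-1}$, and this bound also holds for $j = 1$.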
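For any $\beta \in \mathcal{I}_\alpha^-$, it holds that $\mathbf{E}[I_\alpha I_\beta] = 0$, so that $\operatorname{Cov}(I_\alpha, I_\beta) = -p_\alpha p_\beta$. Putting this all together and applying (20), we have

\begin{align*}
\sum_{\alpha \in \mathcal{I}} \sum_{\beta \in \mathcal{I}_\alpha^-} \bigl|\operatorname{Cov}(I_\alpha, I_\beta)\bigr|
&= \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \sum_{j=1}^r \sum_{\beta \in \mathcal{I}_\alpha^- \cap \mathcal{I}_j} p_\alpha p_\beta \\
&\le \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \frac{1}{[n]_k} \sum_{j=1}^r \bigl|\mathcal{I}_\alpha^- \cap \mathcal{I}_j\bigr| \frac{1}{[n]_j} \\
&\le \sum_{k=1}^r \frac{a(d,k)}{2k} \sum_{j=1}^r \frac{2k(2d-1)^{j-1}}{n} = O\Bigl(\frac{(2d-1)^{2r-1}}{n}\Bigr). \tag{22}
\end{align*}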

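The final sum in (18) is the most difficult to bound. We partition $\mathcal{I}_\alpha^+$ into sets $\mathcal{I}_\alpha^+ = \mathcal{I}_\alpha^0 \cup \cdots \cup \mathcal{I}_\alpha^{k-1}$, where $\mathcal{I}_\alpha^l$ is all cycles in $\mathcal{I}_\alpha^+$ that share exactly $l$ labeled edges with $\alpha$. For any $\beta \in \mathcal{I}_\alpha^+$,

\[ \mathbf{E}[I_\alpha I_\beta] = \mathbf{P}[G_n \text{ contains } \alpha \text{ and } \beta] = \prod_{i=1}^d \frac{1}{[n]_{e_i}}, \]

where $e_i$ is the number of $\pi_i$-labeled edges in $\alpha \cup \beta$. Thus for $\beta \in \mathcal{I}_\alpha^l \cap \mathcal{I}_j$,

\[ \frac{1}{n^{j+k-l}} \le \mathbf{E}[I_\alpha I_\beta] \le \frac{1}{[n]_{j+k-l}}. \tag{23} \]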


We start by seeking estimates on the size of $\mathcal{I}_\alpha^l$ for $l \ge 1$. Fix some choice of $l$ edges of $\alpha$. We start by counting the cycles in $\mathcal{I}_\alpha^l$ that share exactly these edges with $\alpha$. We illustrate this in Figure 5. Call the graph consisting of these edges $H$, and suppose that $H$ has $p$ components. Since it is a forest, $H$ has $l + p$ vertices.

Let $A_1, \ldots, A_p$ be the components of $H$. We can assemble any element $\beta \in \mathcal{I}_\alpha^l$ that overlaps with $\alpha$ in $H$ by stringing together these components in some order, with other edges in between. Each component can appear in $\beta$ in one of two orientations. Since the vertices in $\beta$ have no fixed ordering, we can assume without loss of generality that $\beta$ begins with component $A_1$ with a fixed orientation. This leaves $(p-1)!\,2^{p-1}$ choices for the order and orientation of $A_2, \ldots, A_p$ in $\beta$.

Imagine now the components laid out in a line, with gaps between them, and count the number of ways to fill the gaps. Suppose that $\beta$ is to have length $j$. Each of the $p$ gaps must contain at least one edge, and the total number of edges in all the gaps is $j - l$. Thus the total number of possible gap sizes is the number of compositions of $j - l$ into $p$ parts, or $\binom{j-l-1}{p-1}$.
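The composition count used for the gap sizes can be confirmed by brute force on small values. The sketch below is only an illustration; `count_compositions` is a hypothetical helper, and it assumes `total >= 1` and `parts >= 1`.

```python
from itertools import product
from math import comb

def count_compositions(total, parts):
    """Count ordered tuples of `parts` positive integers summing to `total`."""
    return sum(1
               for sizes in product(range(1, total + 1), repeat=parts)
               if sum(sizes) == total)

# Six gap edges spread over three gaps, each gap receiving at least one edge.
print(count_compositions(6, 3), comb(6 - 1, 3 - 1))
```

Both numbers printed are the count of ways to split $j - l = 6$ edges among $p = 3$ gaps, in agreement with $\binom{j-l-1}{p-1}$.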

Now that we have chosen the number of edges to appear in each gap, we choose the edges themselves. We can do this by giving an ordered list of $j - p - l$ vertices to go in the gaps, along with a label and an orientation for each of the $j - l$ edges this gives. There are $[n-p-l]_{j-p-l}$ ways to choose the vertices. We can give each new edge any orientation and label subject to the constraint that the word of the cycle we construct must be reduced. This means we have at most $2d - 1$ choices for the orientation and label of each new edge, for a total of at most $(2d-1)^{j-l}$.

All together, there are at most $(p-1)!\,2^{p-1}\binom{j-l-1}{p-1}[n-p-l]_{j-p-l}(2d-1)^{j-l}$ elements of $\mathcal{I}_j$ that overlap with the cycle $\alpha$ at the subgraph $H$.

[Figure 5. Assembling an element $\beta \in \mathcal{I}_\alpha^l$ that overlaps with $\alpha$ at a given subgraph $H$. The first panel shows the cycle $\alpha$, with $H$ dashed; $H$ has components $A_1, \ldots, A_p$, and in the example $p = 3$, $k = 11$, $l = 4$, and the cycle $\beta$ to be constructed has length $j = 10$. Step 1: lay out the components $A_1, \ldots, A_p$; the components $A_2, \ldots, A_p$ can be ordered and oriented in $(p-1)!\,2^{p-1}$ ways (in the example the order is $A_1, A_3, A_2$, with the orientation of $A_3$ reversed). Step 2: choose how many edges go in each gap between components; each gap must contain at least one edge and $j - l$ edges are added in total, giving $\binom{j-l-1}{p-1}$ choices (in the example, one edge after $A_1$, three after $A_3$, and two after $A_2$). Step 3: choose the new vertices in $[n-p-l]_{j-p-l}$ ways, and direct and label the new edges in at most $(2d-1)^{j-l}$ ways.]

We now calculate the number of different ways to choose a subgraph $H$ of $\alpha$ with $l$ edges and $p$ components. Suppose $\alpha$ is given as in (19). We first choose a vertex $s_{i_0}$. Then, we can specify which edges to include in $H$ by giving a sequence $a_1, b_1, \ldots, a_p, b_p$ instructing us to include in $H$ the first $a_1$ edges after $s_{i_0}$, then to exclude the next $b_1$, then to include the next $a_2$, and so on. Any sequence for which $a_i$ and $b_i$ are positive integers, $a_1 + \cdots + a_p = l$, and $b_1 + \cdots + b_p = k - l$ gives us a valid choice of $l$ edges of $\alpha$ making up $p$ components. This counts each subgraph $H$ a total of $p$ times, since we could begin with any component of $H$. Hence the number of subgraphs $H$ with $l$ edges and $p$ components is $(k/p)\binom{l-1}{p-1}\binom{k-l-1}{p-1}$. This gives us the bound

\[ \bigl|\mathcal{I}_\alpha^l \cap \mathcal{I}_j\bigr| \le \sum_{p \ge 1} \frac{k}{p}\binom{l-1}{p-1}\binom{k-l-1}{p-1}(p-1)!\,2^{p-1}\binom{j-l-1}{p-1}[n-p-l]_{j-p-l}(2d-1)^{j-l}. \]

We apply the bound $\binom{m}{p-1} \le r^{p-1}/(p-1)!$, valid for $m \le r$, to each of the three binomial coefficients and separate the $p = 1$ term to get

\[ \bigl|\mathcal{I}_\alpha^l \cap \mathcal{I}_j\bigr| \le k(2d-1)^{j-l}[n-1-l]_{j-1-l}\Biggl(1 + \sum_{p \ge 2} \frac{1}{p} \cdot \frac{(2r^3)^{p-1}}{\bigl((p-1)!\bigr)^2\,[n-1-l]_{p-1}}\Biggr). \]



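Since $r \le n^{1/10}$, the sum in the above equation is bounded by an absolute constant. Applying this bound and (23), for any $\alpha \in \mathcal{I}_k$ and $l \ge 1$,

\[ \sum_{\beta \in \mathcal{I}_\alpha^l \cap \mathcal{I}_j} \operatorname{Cov}(I_\alpha, I_\beta) \le \sum_{\beta \in \mathcal{I}_\alpha^l \cap \mathcal{I}_j} \mathbf{E}[I_\alpha I_\beta] = O\Bigl(\frac{k(2d-1)^{j-l}}{n^{k+1}}\Bigr). \]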
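The count $(k/p)\binom{l-1}{p-1}\binom{k-l-1}{p-1}$ of subgraphs $H$ used in this part of the argument can be verified by enumerating edge subsets of a small cycle. This is a brute-force sketch with hypothetical function names; edges of the $k$-cycle are indexed $0, \ldots, k-1$ in cyclic order (an assumption of the sketch), and it is meant for $1 \le l \le k-1$.

```python
from itertools import combinations
from math import comb

def count_subgraphs(k, l, p):
    """Brute force: number of l-edge subsets of a k-cycle forming exactly p paths."""
    count = 0
    for edges in combinations(range(k), l):
        chosen = set(edges)
        # A component starts at each chosen edge whose cyclic predecessor is not chosen.
        components = sum(1 for e in chosen if (e - 1) % k not in chosen)
        if components == p:
            count += 1
    return count

def subgraph_formula(k, l, p):
    """The closed form (k/p) * C(l-1, p-1) * C(k-l-1, p-1)."""
    return k * comb(l - 1, p - 1) * comb(k - l - 1, p - 1) // p

for k, l, p in [(5, 2, 1), (5, 2, 2), (11, 4, 3)]:
    print((k, l, p), count_subgraphs(k, l, p), subgraph_formula(k, l, p))
```

The last triple uses the example values $k = 11$, $l = 4$, $p = 3$; the brute-force count and the formula should agree in every row.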



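Therefore

\begin{align*}
\sum_{\alpha \in \mathcal{I}} \sum_{l \ge 1} \sum_{\beta \in \mathcal{I}_\alpha^l} \operatorname{Cov}(I_\alpha, I_\beta)
&= \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \sum_{l=1}^{k-1} \sum_{\beta \in \mathcal{I}_\alpha^l} \operatorname{Cov}(I_\alpha, I_\beta) \\
&= \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \sum_{l=1}^{k-1} \sum_{j=l+1}^{r} O\Bigl(\frac{k(2d-1)^{j-l}}{n^{k+1}}\Bigr) \\
&= \sum_{k=1}^r \frac{[n]_k\, a(d,k)}{2k} \sum_{l=1}^{k-1} \sum_{j=l+1}^{r} O\Bigl(\frac{k(2d-1)^{j-l}}{n^{k+1}}\Bigr) \\
&= O\Bigl(\frac{(2d-1)^{2r-1}}{n}\Bigr). \tag{24}
\end{align*}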

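Last, we must bound $\sum_{\alpha \in \mathcal{I}} \sum_{\beta \in \mathcal{I}_\alpha^0} \operatorname{Cov}(I_\alpha, I_\beta)$. For any word $w$, let $e_i^w$ be the number of appearances of $\pi_i$ and $\pi_i^{-1}$ in $w$, and write $e^w = (e_1^w, \ldots, e_d^w)$. Let $\alpha$ and $\beta$ be cycles with words $w$ and $u$ respectively, and let $k = |\alpha|$ and $j = |\beta|$. Suppose that $\beta \in \mathcal{I}_\alpha^0$. Then

\[ \operatorname{Cov}(I_\alpha, I_\beta) = \prod_{i=1}^d \frac{1}{[n]_{e_i^w + e_i^u}} - \prod_{i=1}^d \frac{1}{[n]_{e_i^w}[n]_{e_i^u}} \le \frac{\langle e^w, e^u\rangle}{n} \prod_{i=1}^d \frac{1}{[n]_{e_i^w + e_i^u}} \le \frac{\langle e^w, e^u\rangle}{n\,[n]_{k+j}} \]

by Lemma 21. For any pair of words $w \in \mathcal{W}_k$ and $u \in \mathcal{W}_j$, there are at most $[n]_k[n]_j$ pairs of cycles $\alpha, \beta \in \mathcal{I}$ with words $w$ and $u$, respectively. Enumerating over all $w \in \mathcal{W}_k$ and $u \in \mathcal{W}_j$, we count each pair of cycles $\alpha, \beta$ exactly $4kj$ times. Thus

\[ \sum_{\alpha \in \mathcal{I}_k} \sum_{\beta \in \mathcal{I}_\alpha^0 \cap \mathcal{I}_j} \operatorname{Cov}(I_\alpha, I_\beta) \le \frac{1}{4kj} \sum_{w \in \mathcal{W}_k} \sum_{u \in \mathcal{W}_j} \frac{[n]_k[n]_j\,\langle e^w, e^u\rangle}{n\,[n]_{k+j}} = \frac{1 + O(r^2/n)}{4kjn} \Bigl\langle \sum_{w \in \mathcal{W}_k} e^w,\ \sum_{u \in \mathcal{W}_j} e^u \Bigr\rangle. \]

The vector $\sum_{w \in \mathcal{W}_k} e^w$ has every entry equal by symmetry, as does $\sum_{u \in \mathcal{W}_j} e^u$. Thus each entry of $\sum_{w \in \mathcal{W}_k} e^w$ is $k\,a(d,k)/d$, and each entry of $\sum_{u \in \mathcal{W}_j} e^u$ is $j\,a(d,j)/d$. The inner product in the above equation comes to $kj\,a(d,k)\,a(d,j)/d$, giving us

\[ \sum_{\alpha \in \mathcal{I}_k} \sum_{\beta \in \mathcal{I}_\alpha^0 \cap \mathcal{I}_j} \operatorname{Cov}(I_\alpha, I_\beta) \le \frac{a(d,k)\,a(d,j)\,(1 + O(r^2/n))}{4dn} = O\Bigl(\frac{(2d-1)^{j+k-1}}{n}\Bigr). \]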
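The symmetry claim about the summed letter-count vectors can be checked by enumeration for small $d$ and $k$. The sketch below rests on labeled assumptions: $\mathcal{W}_k$ is taken to be the set of cyclically reduced words of length $k$ in $\pi_1^{\pm1}, \ldots, \pi_d^{\pm1}$, with $a(d,k) = |\mathcal{W}_k|$; letters are encoded as the nonzero integers $\pm 1, \ldots, \pm d$; and the function names are hypothetical.

```python
from itertools import product

def cyclically_reduced_words(d, k):
    """All length-k words in which no letter is cyclically followed by its inverse."""
    letters = [sign * g for g in range(1, d + 1) for sign in (1, -1)]
    return [w for w in product(letters, repeat=k)
            if all(w[i] != -w[(i + 1) % k] for i in range(k))]

def summed_letter_counts(d, k):
    """Entry i is the total number of appearances of pi_i or its inverse over all words."""
    totals = [0] * d
    for w in cyclically_reduced_words(d, k):
        for letter in w:
            totals[abs(letter) - 1] += 1
    return totals

d, k = 2, 3
words = cyclically_reduced_words(d, k)
print(len(words), summed_letter_counts(d, k), k * len(words) // d)
```

Each entry of the summed vector should equal $k$ times the number of words divided by $d$, which is the value $k\,a(d,k)/d$ used in the text under the assumption above.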



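Summing over all $1 \le k, j \le r$,

\[ \sum_{\alpha \in \mathcal{I}} \sum_{\beta \in \mathcal{I}_\alpha^0} \operatorname{Cov}(I_\alpha, I_\beta) = O\Bigl(\frac{(2d-1)^{2r-1}}{n}\Bigr). \tag{25} \]

We can now combine equations (21), (22), (24), and (25) with Proposition 19 to show that

\[ d_{TV}(\mathbf{I}, \mathbf{Y}) = O\Bigl(\frac{(2d-1)^{2r-1}}{n}\Bigr). \tag{26} \]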

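Step 3. Approximation of $\mathbf{Y}$ by $\mathbf{Z}$.

By Lemma 20 and (20),

\[ d_{TV}(\mathbf{Y}, \mathbf{Z}) \le \sum_{\alpha \in \mathcal{I}} \bigl|\mathbf{E} Y_\alpha - \mathbf{E} Z_\alpha\bigr| \le \sum_{k=1}^r \sum_{\alpha \in \mathcal{I}_k} \Bigl(\frac{1}{[n]_k} - \frac{1}{n^k}\Bigr) = \sum_{k=1}^r \frac{a(d,k)}{2k}\Bigl(1 - \frac{[n]_k}{n^k}\Bigr). \]

Since $[n]_k \ge n^k(1 - k^2/2n)$,

\[ d_{TV}(\mathbf{Y}, \mathbf{Z}) \le \sum_{k=1}^r \frac{k\,a(d,k)}{4n} = O\Bigl(\frac{(2d-1)^{2r-1}}{n}\Bigr). \]

Together with (26), this bounds the total variation distance between the laws of $\mathbf{I}$ and $\mathbf{Z}$ and proves the theorem. $\square$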

The distributions of any functionals of $\mathbf{I}$ and $\mathbf{Z}$ satisfy the same bound in total variation distance. This gives us several results as easy corollaries, including an improvement on [DJPP11, Theorem 11].

Corollary 23.

(i) Let $(Z_k,\ 1 \le k \le r)$ be a vector of independent Poisson random variables with $\mathbf{E} Z_k = a(d,k)/2k$. Let $C_k$ denote the number of $k$-cycles in $G_n$, a $2d$-regular permutation random graph on $n$ vertices. Then for some absolute constant $c$,

\[ d_{TV}\bigl((C_k,\ 1 \le k \le r),\ (Z_k,\ 1 \le k \le r)\bigr) \le \frac{c\,(2d-1)^{2r-1}}{n}. \]

(ii) Let $(Z_w,\ w \in \mathcal{W}'_K)$ be a vector of independent Poisson random variables with $\mathbf{E} Z_w = 1/h(w)$. Let $C_w$ denote the number of cycles with word $w$ in $G_n$, a $2d$-regular permutation random graph on $n$ vertices. Then for some absolute constant $c$,

\[ d_{TV}\bigl((C_w,\ w \in \mathcal{W}'_K),\ (Z_w,\ w \in \mathcal{W}'_K)\bigr) \le \frac{c\,(2d-1)^{2K-1}}{n}. \]

Proof. Observe that $C_k = \sum_{\alpha \in \mathcal{I}_k} I_\alpha$, and that if we define $Z_k = \sum_{\alpha \in \mathcal{I}_k} Z_\alpha$, then $(Z_k,\ 1 \le k \le r)$ is distributed as described. Thus (i) follows from Theorem 18.

To prove (ii), note that $C_w = \sum_\alpha I_\alpha$, where the sum is over all cycles in $\mathcal{I}$ with word $w$. We then define $Z_w$ as the analogous sum over $Z_\alpha$. Since the number of cycles in $\mathcal{I}$ with word $w$ is $[n]_k/h(w)$, we have $\mathbf{E} Z_w = 1/h(w)$, and the total variation bound follows from Theorem 18. $\square$

We can also use Theorem 18 to bound the likelihood that $G_n$ contains two overlapping cycles of size $r$ or less.

Corollary 24. Let $G_n$ be a $2d$-regular permutation random graph on $n$ vertices. Let $E$ be the event that $G_n$ contains two cycles of length $r$ or less with a vertex in common. Then for some absolute constant $c'$, for all $d \ge 2$ and $n, r \ge 1$,

\[ \mathbf{P}[E] \le \frac{c'\,(2d-1)^{2r}}{n}. \]

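Proof. Let $E'$ be the event that $Z_\alpha = Z_\beta = 1$ for two cycles $\alpha, \beta \in \mathcal{I}$ that have a vertex in common. By Theorem 18,

\[ \mathbf{P}[E] \le \mathbf{P}[E'] + \frac{c\,(2d-1)^{2r-1}}{n}. \]

For any cycle $\alpha \in \mathcal{I}_k$, there are at most $k[n-1]_{j-1}\,a(d,j)$ cycles in $\mathcal{I}_j$ that share a vertex with $\alpha$. For any such cycle $\beta$, the chance that $Z_\alpha = 1$ and $Z_\beta = 1$ is less than $1/[n]_k[n]_j$. By a union bound,

\[ \mathbf{P}[E'] \le \sum_{k=1}^r \sum_{j=1}^r \frac{[n]_k\,a(d,k)}{2k} \cdot \frac{k[n-1]_{j-1}\,a(d,j)}{[n]_k[n]_j} = \sum_{k=1}^r \sum_{j=1}^r \frac{a(d,k)\,a(d,j)}{2n} = O\Bigl(\frac{(2d-1)^{2r}}{n}\Bigr). \qquad \square \]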
Proof of Theorem 17. When $d = 1$, there is only one word of each length in $\mathcal{W}'_K$, and statement (i) reduces to the well-known fact that the cycle counts of a random permutation converge to independent Poisson random variables (see [AT92] for much more information on this). In this case, $G(t)$ is made up of disjoint cycles for all times $t$, so that statement (ii) is trivially satisfied.

When $d \ge 2$, let $C_w^{(n)}$ be the number of cycles with word $w$ in $G_n$, as in Corollary 23(ii). The random vector $(C_w(t),\ w \in \mathcal{W}'_K)$ is a mixture of the random vectors $(C_w^{(n)},\ w \in \mathcal{W}'_K)$ over different values of $n$. That is,

\[ \mathbf{P}\bigl[(C_w(t),\ w \in \mathcal{W}'_K) \in A\bigr] = \sum_{n=1}^\infty \mathbf{P}[M_t = n]\ \mathbf{P}\bigl[(C_w^{(n)},\ w \in \mathcal{W}'_K) \in A\bigr] \]

for any set $A$, recalling that $G(t) = G_{M_t}$. Corollary 23(ii) together with the fact that $\mathbf{P}[M_t > N] \to 1$ as $t \to \infty$ for any $N$ imply that $(C_w(t),\ w \in \mathcal{W}'_K)$ converges in law to $(Z_w,\ w \in \mathcal{W}'_K)$, establishing statement (i). Statement (ii) follows in the same way from Corollary 24. $\square$

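Proof of Lemma 20. We will apply the Stein-Chen method directly. Define the operator $\mathcal{A}$ by

\[ \mathcal{A}h(x) = \sum_{\alpha \in \mathcal{I}} \mathbf{E}[Z_\alpha]\bigl(h(x + e_\alpha) - h(x)\bigr) + \sum_{\alpha \in \mathcal{I}} x_\alpha \bigl(h(x - e_\alpha) - h(x)\bigr) \]

for any $h\colon \mathbb{Z}_+^{|\mathcal{I}|} \to \mathbb{R}$ and $x \in \mathbb{Z}_+^{|\mathcal{I}|}$. This is the Stein operator for the law of $\mathbf{Z}$, and $\mathbf{E}\mathcal{A}h(\mathbf{Z}) = 0$ for any bounded function $h$. By Proposition 10.1.2 and Lemma 10.1.3 in [BHJ92], for any set $A \subseteq \mathbb{Z}_+^{|\mathcal{I}|}$, there is a function $h$ such that

\[ \mathcal{A}h(x) = 1\{x \in A\} - \mathbf{P}[\mathbf{Z} \in A], \]

and this function has the property that

\[ \sup_{x \in \mathbb{Z}_+^{|\mathcal{I}|},\ \alpha \in \mathcal{I}} \bigl|h(x + e_\alpha) - h(x)\bigr| \le 1. \tag{27} \]

Thus we can bound the total variation distance between the laws of $\mathbf{Y}$ and $\mathbf{Z}$ by bounding $|\mathbf{E}\mathcal{A}h(\mathbf{Y})|$ over all such functions $h$. We write $\mathcal{A}h(\mathbf{Y})$ as

\begin{align*}
\mathcal{A}h(\mathbf{Y}) &= \sum_{\alpha \in \mathcal{I}} \mathbf{E}[Y_\alpha]\bigl(h(\mathbf{Y} + e_\alpha) - h(\mathbf{Y})\bigr) + \sum_{\alpha \in \mathcal{I}} Y_\alpha \bigl(h(\mathbf{Y} - e_\alpha) - h(\mathbf{Y})\bigr) \\
&\qquad + \sum_{\alpha \in \mathcal{I}} \bigl(\mathbf{E} Z_\alpha - \mathbf{E} Y_\alpha\bigr)\bigl(h(\mathbf{Y} + e_\alpha) - h(\mathbf{Y})\bigr).
\end{align*}

The first two of these sums have expectation zero, so

\[ \bigl|\mathbf{E}\mathcal{A}h(\mathbf{Y})\bigr| \le \sum_{\alpha \in \mathcal{I}} \bigl|\mathbf{E} Z_\alpha - \mathbf{E} Y_\alpha\bigr|\ \mathbf{E}\bigl|h(\mathbf{Y} + e_\alpha) - h(\mathbf{Y})\bigr|. \]

By (27), $|h(\mathbf{Y} + e_\alpha) - h(\mathbf{Y})| \le 1$, which proves the lemma. $\square$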

Proof of Lemma 21. We define a family of independent random maps $\sigma_i$ and $\tau_i$ for $1 \le i \le d$. Choose $\sigma_i$ uniformly from all injective maps from $[a_i]$ to $[n]$, and choose $\tau_i$ uniformly from all injective maps from $[b_i]$ to $[n]$. Effectively, $\sigma_i$ and $\tau_i$ are random ordered subsets of $[n]$. We say that $\sigma_i$ and $\tau_i$ clash if their images overlap. Then

\[ \mathbf{P}[\sigma_i \text{ and } \tau_i \text{ clash for some } i] = 1 - \prod_{i=1}^d \frac{[n]_{a_i+b_i}}{[n]_{a_i}[n]_{b_i}}. \]

For any $1 \le i \le d$, $1 \le j \le a_i$, and $1 \le k \le b_i$, the probability that $\sigma_i(j) = \tau_i(k)$ is $1/n$. By a union bound,

\[ \mathbf{P}[\sigma_i \text{ and } \tau_i \text{ clash for some } i] \le \sum_{i=1}^d \frac{a_i b_i}{n} = \frac{\langle a, b\rangle}{n}. \]

We finish the proof by dividing both sides of this inequality by $\prod_{i=1}^d [n]_{a_i+b_i}$. $\square$